Language overviews
For every language, this report includes a three-panel overview plot showing (A) the timing of turn-taking (for floor transfers only); (B) annotation duration in relation to transition timing (a quick way to spot oddities in segmentation data); and (C) tokenised words ranked by frequency (with the top 10 displayed).
Plot axes are deliberately not standardized, so that possible outliers remain visible. The figure panels are followed by a few conversation samples, drawn at random from the larger corpus.
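As a rough illustration of panel C, the ranking of tokenised words by frequency can be sketched as follows. This is a minimal sketch, not the report's own code: the function name `top_words` is hypothetical, and lowercasing plus whitespace splitting is an assumed stand-in for whatever tokeniser the report actually uses.

```python
from collections import Counter


def top_words(utterances, n=10):
    """Return the n most frequent tokens as (word, count) pairs.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; the report's real tokenisation may differ.
    """
    counts = Counter(
        word for utterance in utterances for word in utterance.lower().split()
    )
    # most_common sorts by descending count; ties keep first-seen order.
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["yeah yeah no", "no yeah"]
    for rank, (word, count) in enumerate(top_words(sample, n=2), start=1):
        print(rank, word, count)
```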
The remainder of the information comes in tables listing key characteristics of the corpus, including:
turns: number of annotations with timing information in the corpus, which in most corpora corresponds to the number of turns at talk
translated: the proportion of turns for which there is a translation available in English/French/German (on a scale from 0 to 1)
turnduration: mean duration of turns in this corpus
talkprop: sum of all annotation durations divided by the length of the source. A value >1 indicates a densely annotated recording with considerable overlap; a value <0.7 indicates a less densely annotated recording, possibly with untranscribed parts.
people: total number of distinct participants encountered in all source records for this corpus
hours: total number of hours (counting from the first transcription until the last by source)
turns_per_h: number of turns per hour in this corpus
Following this is a simple table of types of annotations encountered: at least talk, but possibly also laugh and breath (and sometimes NA). And finally there is a list of source files along with basic descriptive statistics per source.
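The corpus characteristics defined above can be sketched in code. This is a minimal sketch under stated assumptions, not the report's implementation: the `Annotation` record and its field names are hypothetical, every annotation is assumed to carry timing in milliseconds, and the length of a source is taken as the span from its first annotation start to its last annotation end.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Annotation:
    """One timed annotation; all field names here are hypothetical."""
    source: str
    participant: str
    begin_ms: int
    end_ms: int
    translation: Optional[str] = None


def corpus_stats(annotations: Iterable[Annotation]) -> Dict[str, float]:
    rows = list(annotations)
    if not rows:
        raise ValueError("need at least one annotation")

    durations = [a.end_ms - a.begin_ms for a in rows]

    # Length of each source: first annotation start to last annotation end.
    spans: Dict[str, Tuple[int, int]] = {}
    for a in rows:
        first, last = spans.get(a.source, (a.begin_ms, a.end_ms))
        spans[a.source] = (min(first, a.begin_ms), max(last, a.end_ms))
    total_ms = sum(last - first for first, last in spans.values())
    if total_ms <= 0:
        raise ValueError("sources have no usable timing information")

    hours = total_ms / 3_600_000
    return {
        "turns": len(rows),
        "translated": sum(1 for a in rows if a.translation) / len(rows),
        "turnduration": sum(durations) / len(rows),  # mean, in ms
        "talkprop": sum(durations) / total_ms,       # >1 means overlap
        "people": len({a.participant for a in rows}),
        "hours": hours,
        "turns_per_h": len(rows) / hours,
    }
```

The same arithmetic can be used to sanity-check a summary row: for the first corpus below, 721 turns over 0.48 hours gives roughly 1502 turns per hour, matching the reported value.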
ǂAkhoe Haiǁom
Short name: akhoe_haikom; glottolog name: Hai//om-Akhoe; glottocode: haio1238; family/type: Khoe-Kwadi; macroarea: Africa
URL: https://hdl.handle.net/1839/b1796725-1a49-48ee-93ea-75e5b440c7bc

0.5 hours
| 721 | 1 | 3253 | 1201 | 0.65 | 18 | 0.48 | 1502 |
samples

2 sources
| /akhoe_haikom1/Handcraft_3_Selection_complete | 131 | 1 | 709 | 8 | 0 | 0.8 | 3.4 | 0.06 |
| /akhoe_haikom1/state_hospital | 590 | 1 | 2544 | 10 | 0 | 0.5 | 25.3 | 0.42 |
Akpes (Àbèsàbèsì)
Short name: akpes; glottolog name: Akpes; glottocode: akpe1248; family/type: Atlantic-Congo; macroarea: Africa
URL: https://www.elararchive.org/dk0555

0.3 hours
| 635 | 0.49 | 3965 | 1753 | 1 | 2 | 0.3 | 2117 |
samples

2 sources
| /akpes1/ibe049-00s | 187 | 0.00 | 753 | 2 | 0 | 1 | 4.6 | 0.08 |
| /akpes1/ibe140-00s | 448 | 0.98 | 3212 | 2 | 0 | 1 | 13.1 | 0.22 |
Ambel
Short name: ambel; glottolog name: Waigeo; glottocode: waig1244; family/type: Austronesian; macroarea: Papunesia
URL: http://hdl.handle.net/2196/00-0000-0000-000C-E849-2

0.7 hours
| 1509 | 0.84 | 6601 | 1938 | 1.1 | 16 | 0.7 | 2156 |
annotation types
| [cough] | 3 |
| laugh | 102 |
| talk | 1386 |
| NA | 18 |
samples

5 sources
| /ambel1/AM056 | 146 | 0.97 | 747 | 5 | 0 | 1.0 | 5.9 | 0.10 |
| /ambel1/AM057 | 129 | 0.98 | 570 | 3 | 0 | 1.2 | 3.8 | 0.06 |
| /ambel1/AM064 | 674 | 0.73 | 3038 | 7 | 0 | 1.2 | 17.9 | 0.30 |
| /ambel1/AM067 | 484 | 0.67 | 1937 | 4 | 0 | 1.3 | 11.3 | 0.19 |
| /ambel1/AM107_0003 | 76 | 0.86 | 309 | 3 | 0 | 0.8 | 3.1 | 0.05 |
Anal Naga
Short name: anal; glottolog name: Anal; glottocode: anal1239; family/type: Sino-Tibetan; macroarea: Eurasia
URL: http://hdl.handle.net/2196/af2415d6-dc75-4330-ba5d-7b8122e50982

4.6 hours
| 6767 | 0 | 26826 | 1484 | 0.67 | 18 | 4.63 | 1462 |
annotation types
| [nod] | 1 |
| laugh | 12 |
| talk | 6010 |
| NA | 744 |
samples

15 sources
Showing only the first 10 sources; use allsources=T to show all
| /anal1/anm_20160916_PO_Wolring_1 | 538 | 0 | 2593 | 4 | 0 | 0.7 | 19.4 | 0.32 |
| /anal1/anm_20160917_LamphouPasna_Thotson_teashop | 1389 | 0 | 5027 | 6 | 0 | 0.8 | 38.1 | 0.64 |
| /anal1/anm_20160918_LCharu_Rockson_chat | 226 | 0 | 902 | 4 | 0 | 0.5 | 12.5 | 0.21 |
| /anal1/anm_20160918_LCharu_Rockson_dialogue_2 | 561 | 0 | 2007 | 2 | 0 | 0.9 | 14.8 | 0.25 |
| /anal1/anm_20160924_Thotson_grandmothers_1 | 117 | 0 | 542 | 2 | 0 | 0.8 | 4.3 | 0.07 |
| /anal1/anm_20161013_Jm_Dutang_lunch2 | 139 | 0 | 411 | 4 | 0 | 0.4 | 6.8 | 0.11 |
| /anal1/anm_20161014_PO_Darchol_evening_conversation | 170 | 0 | 601 | 3 | 0 | 0.5 | 9.9 | 0.16 |
| /anal1/anm_20161014_PO_Darchol_evening_conversation2 | 689 | 0 | 1998 | 4 | 0 | 0.5 | 20.1 | 0.34 |
| /anal1/anm_20161014_PO_Ralruwng_family_lunch1 | 600 | 0 | 1825 | 3 | 0 | 0.3 | 58.9 | 0.98 |
| /anal1/anm_20161014_PO_Ralruwng_family_lunch3 | 111 | 0 | 365 | 3 | 0 | 0.2 | 17.8 | 0.30 |
Egyptian Arabic
Short name: arabic; glottolog name: Egyptian Arabic; glottocode: egyp1253; family/type: Afro-Asiatic; macroarea: Africa
URL: https://catalog.ldc.upenn.edu/LDC97S45

20.3 hours
| 33120 | 0 | 201207 | 2190 | 1 | 8 | 20.3 | 1632 |
annotation types
| [cough] | 50 |
| breath | 83 |
| laugh | 491 |
| talk | 31605 |
| NA | 891 |
samples

140 sources
Showing only the first 10 sources; use allsources=T to show all
| /arabic1/4023 | 145 | 0 | 1538 | 2 | 3 | 1.0 | 10.1 | 0.17 |
| /arabic1/4150 | 198 | 0 | 1949 | 3 | 5 | 1.0 | 10.0 | 0.17 |
| /arabic1/4194 | 259 | 0 | 1667 | 3 | 5 | 1.0 | 10.1 | 0.17 |
| /arabic1/4213 | 146 | 0 | 1580 | 2 | 0 | 1.0 | 10.0 | 0.17 |
| /arabic1/4264 | 265 | 0 | 1813 | 3 | 8 | 1.1 | 10.6 | 0.18 |
| /arabic1/4283 | 181 | 0 | 1753 | 3 | 13 | 0.9 | 10.7 | 0.18 |
| /arabic1/4297 | 253 | 0 | 1605 | 2 | 9 | 1.0 | 11.4 | 0.19 |
| /arabic1/4299 | 121 | 0 | 1770 | 2 | 5 | 1.0 | 10.1 | 0.17 |
| /arabic1/4345 | 322 | 0 | 1643 | 3 | 13 | 1.0 | 10.1 | 0.17 |
| /arabic1/4367 | 238 | 0 | 1037 | 2 | 2 | 1.0 | 7.3 | 0.12 |
Arapaho
Short name: arapaho; glottolog name: Arapaho; glottocode: arap1274; family/type: Algic; macroarea: North America
URL: http://hdl.handle.net/2196/3bba11be-a5e2-47dd-bfe5-42f2ee9e0bf4

4.1 hours
| 4821 | 0.89 | 55850 | 2251 | 0.72 | 34 | 4.07 | 1185 |
annotation types
| [nod] | 2 |
| laugh | 3 |
| talk | 4808 |
| NA | 8 |
samples

32 sources
Showing only the first 10 sources; use allsources=T to show all
| /arapaho1/1 | 62 | 0.98 | 656 | 2 | 0 | 0.8 | 3.7 | 0.06 |
| /arapaho1/14a | 35 | 0.94 | 1062 | 5 | 2 | 0.3 | 7.5 | 0.12 |
| /arapaho1/14b | 12 | 1.00 | 37 | 3 | 0 | 0.6 | 0.6 | 0.01 |
| /arapaho1/14c | 7 | 0.86 | 18 | 3 | 0 | 0.7 | 0.2 | 0.00 |
| /arapaho1/14d | 13 | 1.00 | 22 | 4 | 0 | 0.6 | 0.4 | 0.01 |
| /arapaho1/14e | 15 | 1.00 | 78 | 4 | 0 | 0.8 | 0.4 | 0.01 |
| /arapaho1/14f | 33 | 0.97 | 293 | 4 | 0 | 1.0 | 1.2 | 0.02 |
| /arapaho1/14g | 69 | 1.00 | 651 | 5 | 0 | 0.8 | 2.5 | 0.04 |
| /arapaho1/14h | 9 | 1.00 | 91 | 3 | 0 | 0.7 | 0.3 | 0.00 |
| /arapaho1/17b | 44 | 0.45 | 200 | 7 | 0 | 0.6 | 1.7 | 0.03 |
Asimjeeg Datooga
Short name: asimjeeg_datooga; glottolog name: Isimjeega-Rootigaanga; glottocode: isim1234; family/type: Nilotic; macroarea: Africa
URL: http://hdl.handle.net/2196/1e9151d8-df0a-4ea7-bb6d-377b65b14310

0.4 hours
| 465 | 0 | 2843 | 2612 | 0.8 | 1 | 0.42 | 1107 |
samples
2 sources
| /asimjeeg_datooga1/IGS0229_2017-3-1_5 | 221 | 0 | 1521 | 1 | 0 | 0.8 | 12.9 | 0.21 |
| /asimjeeg_datooga1/IGS0229_2017-3-3_10 | 244 | 0 | 1322 | 1 | 0 | 0.8 | 12.4 | 0.21 |
Baa
Short name: baa; glottolog name: Baa; glottocode: kwaa1262; family/type: Atlantic-Congo; macroarea: Africa
URL: http://hdl.handle.net/2196/e050a2cd-f61d-435e-824e-93d24877bbaa

1.1 hours
| 1361 | 0.96 | 12553 | 2506 | 0.85 | 7 | 1.09 | 1249 |
samples

2 sources
| /baa1/KWB008 | 1003 | 0.99 | 9693 | 2 | 0 | 0.9 | 46.3 | 0.77 |
| /baa1/KWB033 | 358 | 0.94 | 2860 | 5 | 0 | 0.8 | 19.0 | 0.32 |
Besemah
Short name: besemah; glottolog name: Musi; glottocode: musi1241; family/type: Austronesian; macroarea: Papunesia
URL: https://hdl.handle.net/1839/00-0000-0000-0022-6B59-B

2.4 hours
| 5106 | 1 | 20316 | 1962 | 1.12 | 14 | 2.41 | 2119 |
samples

4 sources
| /besemah1/BES-20130426-HEN | 982 | 1 | 3035 | 9 | 0 | 1.2 | 26.4 | 0.44 |
| /besemah1/BES-20130506-HEN | 2292 | 1 | 8954 | 3 | 0 | 1.3 | 59.3 | 0.99 |
| /besemah1/BJM01-002-01 | 739 | 1 | 4297 | 3 | 0 | 0.9 | 31.0 | 0.52 |
| /besemah1/BJM01-015-01 | 1093 | 1 | 4030 | 3 | 0 | 1.1 | 27.5 | 0.46 |
Brazilian Portuguese
Short name: brazilian_portuguese; glottolog name: Brazilian Portuguese; glottocode: braz1246; family/type: Indo-European; macroarea: South America
URL: https://fale.ufal.br/projeto/nurcdigital/

1.5 hours
| 3242 | 0 | 17109 | 1633 | 1 | 2 | 1.47 | 2205 |
samples

1 sources
| /brazilian_portuguese1/NURC_RE_D2_340 | 3242 | 0 | 17109 | 2 | 0 | 1 | 88.4 | 1.47 |
Catalan
Short name: catalan; glottolog name: Catalan; glottocode: stan1289; family/type: Indo-European; macroarea: Eurasia
URL: https://catalog.elra.info/en-us/repository/browse/ELRA-S0407/

6.7 hours
| 11059 | 0 | 93827 | 1992 | 0.91 | 24 | 6.65 | 1663 |
annotation types
| [blow] | 12 |
| breath | 27 |
| laugh | 103 |
| talk | 10912 |
| NA | 5 |
samples

42 sources
Showing only the first 10 sources; use allsources=T to show all
| /catalan1/ca_f01r_m04r_fcd | 351 | 0 | 3079 | 2 | 0 | 1.1 | 10.1 | 0.17 |
| /catalan1/ca_f01r_m04r_tod | 406 | 0 | 3851 | 2 | 0 | 1.1 | 13.1 | 0.22 |
| /catalan1/ca_f01r_m04r_trd | 414 | 0 | 3495 | 2 | 0 | 0.9 | 16.4 | 0.27 |
| /catalan1/ca_f01r_m04r_und | 434 | 0 | 3621 | 2 | 0 | 1.0 | 15.2 | 0.25 |
| /catalan1/ca_f02a_m05a_fcd | 392 | 0 | 3139 | 2 | 0 | 1.2 | 10.1 | 0.17 |
| /catalan1/ca_f02a_m05a_tod | 483 | 0 | 3681 | 2 | 0 | 1.1 | 12.6 | 0.21 |
| /catalan1/ca_f02a_m05a_trd | 338 | 0 | 2424 | 2 | 0 | 1.0 | 10.0 | 0.17 |
| /catalan1/ca_f02a_m05a_und | 481 | 0 | 2665 | 2 | 0 | 1.0 | 11.0 | 0.18 |
| /catalan1/ca_f37s_f38s_fcd | 439 | 0 | 3885 | 2 | 0 | 1.0 | 12.6 | 0.21 |
| /catalan1/ca_f37s_f38s_tod | 182 | 0 | 1883 | 2 | 0 | 0.9 | 7.4 | 0.12 |
Chitkuli Kinnauri
Short name: chitkuli; glottolog name: Chhitkul-Rakchham; glottocode: chit1279; family/type: Sino-Tibetan; macroarea: Eurasia
URL: http://hdl.handle.net/2196/cf110665-3694-4e74-a8f8-79e105d89b50

1.1 hours
| 1123 | 0.9 | 13462 | 4560 | 1.25 | 15 | 1.14 | 985 |
samples

10 sources
| /chitkuli1/DEB_cik01-RK-BSN1-2018-10-15 | 88 | 1 | 1056 | 2 | 0 | 1.0 | 6.6 | 0.11 |
| /chitkuli1/DEB_cik03-GD-AS-2018-11-01 | 261 | 1 | 1936 | 2 | 0 | 1.1 | 14.0 | 0.23 |
| /chitkuli1/DEB_cik04-CRN-YS1-2018-11-22 | 41 | 1 | 805 | 2 | 0 | 1.0 | 3.9 | 0.07 |
| /chitkuli1/DEB_cik06-BS2-TS-2019-05-26 | 97 | 1 | 1958 | 2 | 0 | 1.6 | 7.0 | 0.12 |
| /chitkuli1/DEB_cik08-BSN2-HN-2019-05-28 | 193 | 1 | 2544 | 2 | 0 | 1.5 | 10.8 | 0.18 |
| /chitkuli1/NDB_cik01-VKN-NB1-2018-11-21 | 82 | 1 | 848 | 2 | 0 | 1.3 | 4.6 | 0.08 |
| /chitkuli1/NDB_cik09-SD3-SD4-2019-05-27 | 106 | 1 | 1671 | 2 | 0 | 1.7 | 6.5 | 0.11 |
| /chitkuli1/NDB_cik10-MB-RB1-2019-05-28 | 113 | 0 | 961 | 2 | 0 | 1.1 | 6.1 | 0.10 |
| /chitkuli1/TRD_cik06-BS1-AD-2019-03-07 | 81 | 1 | 813 | 2 | 0 | 1.1 | 3.8 | 0.06 |
| /chitkuli1/TRD_cik11-SD2-NB2-2019-04-11 | 61 | 1 | 870 | 2 | 0 | 1.1 | 4.8 | 0.08 |
Cora
Short name: cora; glottolog name: Santa Teresa Cora; glottocode: sant1424; family/type: Uto-Aztecan; macroarea: North America
URL: http://hdl.handle.net/2196/0829a3a6-92c4-4346-8e37-04845cdd1f7f

0.8 hours
| 913 | 0 | 4866 | 3300 | 1.05 | 4 | 0.78 | 1171 |
samples

2 sources
| /cora1/cora_sjc065 | 684 | 0 | 2549 | 2 | 0 | 1.0 | 24.0 | 0.40 |
| /cora1/cora_sjc106 | 229 | 0 | 2317 | 2 | 0 | 1.1 | 23.1 | 0.38 |
Croatian
Short name: croatian; glottolog name: Croatian Standard; glottocode: croa1245; family/type: Indo-European; macroarea: Eurasia
URL: https://ca.talkbank.org/access/Croatian.html

24.1 hours
| 59946 | 0 | 310196 | 1407 | 0.8 | 392 | 24.12 | 2485 |
annotation types
| [cough] | 38 |
| [sigh] | 8 |
| [yawn] | 2 |
| laugh | 734 |
| talk | 58613 |
| NA | 551 |
samples

161 sources
Showing only the first 10 sources; use allsources=T to show all
| /croatian1/2011_56 | 407 | 0 | 3235 | 3 | 407 | 0 | -Inf | 0 |
| /croatian1/2011_57 | 438 | 0 | 2214 | 2 | 438 | 0 | -Inf | 0 |
| /croatian1/2011_58 | 304 | 0 | 1523 | 3 | 304 | 0 | -Inf | 0 |
| /croatian1/2011_59 | 317 | 0 | 2182 | 2 | 317 | 0 | -Inf | 0 |
| /croatian1/2011_60 | 423 | 0 | 2486 | 3 | 423 | 0 | -Inf | 0 |
| /croatian1/2011_61 | 561 | 0 | 3351 | 2 | 561 | 0 | -Inf | 0 |
| /croatian1/2011_62 | 582 | 0 | 2833 | 4 | 582 | 0 | -Inf | 0 |
| /croatian1/2011_63 | 354 | 0 | 2310 | 5 | 354 | 0 | -Inf | 0 |
| /croatian1/2011_64 | 416 | 0 | 2071 | 3 | 416 | 0 | -Inf | 0 |
| /croatian1/2011_65 | 87 | 0 | 618 | 2 | 87 | 0 | -Inf | 0 |
Czech
Short name: czech; glottolog name: Czech; glottocode: czec1258; family/type: Indo-European; macroarea: Eurasia
URL: https://mirjamernestus.nl/Ernestus/NCCCz/index.php

28.7 hours
| 63826 | 0 | 354079 | 2267 | 1.4 | 3 | 28.69 | 2225 |
annotation types
| breath | 1408 |
| laugh | 10370 |
| talk | 50843 |
| NA | 1205 |
samples

19 sources
Showing only the first 10 sources; use allsources=T to show all
| /czech1/10_181108 | 3369 | 0 | 20389 | 3 | 2 | 1.5 | 90.5 | 1.51 |
| /czech1/11_181108 | 3144 | 0 | 17910 | 3 | 10 | 1.3 | 90.2 | 1.50 |
| /czech1/12_191108 | 3036 | 0 | 18983 | 3 | 0 | 1.4 | 90.2 | 1.50 |
| /czech1/13_201108 | 3545 | 0 | 17234 | 3 | 1 | 1.3 | 89.9 | 1.50 |
| /czech1/15_211108 | 4654 | 0 | 20856 | 3 | 1 | 1.6 | 90.6 | 1.51 |
| /czech1/16_211108 | 3712 | 0 | 19034 | 3 | 1 | 1.3 | 90.7 | 1.51 |
| /czech1/18_241108 | 3264 | 0 | 21347 | 3 | 3 | 1.4 | 90.6 | 1.51 |
| /czech1/19_241108 | 2770 | 0 | 17503 | 3 | 0 | 1.1 | 90.3 | 1.50 |
| /czech1/20_251108 | 4318 | 0 | 17085 | 3 | 0 | 1.5 | 90.9 | 1.52 |
| /czech1/21_261108 | 3792 | 0 | 20596 | 3 | 1 | 1.4 | 91.4 | 1.52 |
Danish
Short name: danish; glottolog name: Danish; glottocode: dani1285; family/type: Indo-European; macroarea: Eurasia
URL: https://samtalebank.talkbank.org/

3.3 hours
| 6115 | 0 | 39260 | 1418 | 0.81 | 22 | 3.34 | 1831 |
annotation types
| [sniff] | 5 |
| breath | 22 |
| laugh | 7 |
| talk | 5887 |
| NA | 194 |
samples

9 sources
| /danish1/225_deller | 1239 | 0 | 7966 | 6 | 18 | 0.6 | 50.0 | 0.83 |
| /danish1/anne_og_beate | 307 | 0 | 2811 | 2 | 21 | 0.9 | 10.1 | 0.17 |
| /danish1/gamledage | 689 | 0 | 3282 | 3 | 15 | 1.0 | 13.0 | 0.22 |
| /danish1/kartofler_og_broccoli | 824 | 0 | 5609 | 4 | 6 | 0.5 | 43.1 | 0.72 |
| /danish1/madlavning | 280 | 0 | 1980 | 3 | 2 | 0.7 | 11.9 | 0.20 |
| /danish1/omfodbold | 812 | 0 | 4132 | 4 | 54 | 1.1 | 15.2 | 0.25 |
| /danish1/politiforhoer | 168 | 0 | 1472 | 5 | 3 | 0.6 | 7.8 | 0.13 |
| /danish1/preben_og_thomas | 1015 | 0 | 7591 | 3 | 19 | 0.8 | 30.6 | 0.51 |
| /danish1/samfundskrise | 781 | 0 | 4417 | 2 | 33 | 1.1 | 18.5 | 0.31 |
Duoxu
Short name: duoxu; glottolog name: Ersu; glottocode: ersu1241; family/type: Sino-Tibetan; macroarea: Eurasia
URL: NA

0.4 hours
| 327 | 0.5 | 3128 | 3530 | 0.8 | 4 | 0.4 | 818 |
samples

2 sources
| /duoxu1/duoxu800 | 157 | 1 | 1460 | 2 | 0 | 0.7 | 11.8 | 0.2 |
| /duoxu1/duoxu801 | 170 | 0 | 1668 | 2 | 0 | 0.9 | 11.8 | 0.2 |
Dutch
Short name: dutch; glottolog name: Dutch; glottocode: dutc1256; family/type: Indo-European; macroarea: Eurasia
URL: http://hdl.handle.net/10032/tm-a2-d9

387.6 hours
| 826207 | 0 | 4736704 | 1674 | 1.01 | 1269 | 387.61 | 2132 |
annotation types
| laugh | 28742 |
| talk | 789104 |
| NA | 8361 |
samples

2787 sources
Showing only the first 10 sources; use allsources=T to show all
| /dutch1/fn000248 | 318 | 0 | 1495 | 2 | 0 | 0.9 | 8.0 | 0.13 |
| /dutch1/fn000249 | 442 | 0 | 2095 | 2 | 0 | 0.9 | 11.3 | 0.19 |
| /dutch1/fn000250 | 556 | 0 | 2773 | 2 | 0 | 0.9 | 13.5 | 0.23 |
| /dutch1/fn000251 | 684 | 0 | 3142 | 2 | 0 | 0.8 | 17.9 | 0.30 |
| /dutch1/fn000252 | 429 | 0 | 2312 | 2 | 0 | 0.8 | 13.2 | 0.22 |
| /dutch1/fn000253 | 511 | 0 | 2708 | 2 | 0 | 0.9 | 14.7 | 0.24 |
| /dutch1/fn000254 | 752 | 0 | 4023 | 3 | 0 | 1.0 | 21.2 | 0.35 |
| /dutch1/fn000259 | 437 | 0 | 2158 | 2 | 0 | 0.8 | 12.2 | 0.20 |
| /dutch1/fn000260 | 650 | 0 | 3124 | 2 | 0 | 0.9 | 16.3 | 0.27 |
| /dutch1/fn000261 | 512 | 0 | 2704 | 2 | 0 | 0.9 | 13.3 | 0.22 |
English
Short name: english; glottolog name: North American English; glottocode: nort3314; family/type: Indo-European; macroarea: North America
URL: https://ca.talkbank.org/access/CallFriend/eng-n.html

28 hours
| 55187 | 0 | 348394 | 1698 | 0.93 | 35 | 28 | 1971 |
annotation types
| [clearsthroat] | 24 |
| [cough] | 39 |
| [groan] | 3 |
| [inhales] | 24 |
| [lipsmack] | 47 |
| [sigh] | 6 |
| [sneeze] | 13 |
| [sniff] | 38 |
| [yawn] | 4 |
| breath | 1514 |
| laugh | 983 |
| talk | 51897 |
| NA | 595 |
samples

171 sources
Showing only the first 10 sources; use allsources=T to show all
| /english2/4175 | 1130 | 0 | 6114 | 4 | 65 | 0.9 | 29.9 | 0.50 |
| /english2/4504 | 266 | 0 | 1616 | 3 | 32 | 0.9 | 7.8 | 0.13 |
| /english2/4708 | 108 | 0 | 779 | 2 | 0 | 0.7 | 3.1 | 0.05 |
| /english2/4745 | 7 | 0 | 17 | 2 | 1 | 1.0 | 0.1 | 0.00 |
| /english2/4823 | 119 | 0 | 1124 | 2 | 1 | 0.5 | 5.5 | 0.09 |
| /english2/4874 | 140 | 0 | 425 | 2 | 6 | 0.8 | 2.7 | 0.05 |
| /english2/4889 | 1423 | 0 | 7630 | 3 | 58 | 1.0 | 25.3 | 0.42 |
| /english2/4984 | 1277 | 0 | 6657 | 4 | 114 | 1.0 | 30.0 | 0.50 |
| /english2/5000 | 1294 | 0 | 7351 | 2 | 90 | 1.0 | 30.0 | 0.50 |
| /english2/5051 | 72 | 0 | 516 | 2 | 13 | 1.0 | 2.8 | 0.05 |
Farsi
Short name: farsi; glottolog name: Western Farsi; glottocode: west2369; family/type: Indo-European; macroarea: Eurasia
URL: https://catalog.ldc.upenn.edu/LDC2014S01

25.3 hours
| 33045 | 0 | 239763 | 2773 | 1.03 | 8 | 25.3 | 1306 |
annotation types
| [cough] | 57 |
| [lipsmack] | 14 |
| laugh | 486 |
| talk | 32334 |
| NA | 154 |
samples

100 sources
Showing only the first 10 sources; use allsources=T to show all
| /farsi1/fa_4046 | 501 | 0 | 3622 | 2 | 0 | 0.9 | 24.6 | 0.41 |
| /farsi1/fa_4054 | 369 | 0 | 1812 | 2 | 0 | 0.4 | 26.7 | 0.44 |
| /farsi1/fa_4099 | 311 | 0 | 1928 | 2 | 0 | 0.9 | 15.8 | 0.26 |
| /farsi1/fa_4117 | 335 | 0 | 2854 | 2 | 0 | 1.1 | 20.5 | 0.34 |
| /farsi1/fa_4130 | 326 | 0 | 2514 | 2 | 0 | 1.0 | 15.1 | 0.25 |
| /farsi1/fa_4146 | 352 | 0 | 2365 | 2 | 0 | 0.8 | 17.1 | 0.28 |
| /farsi1/fa_4218 | 384 | 0 | 2445 | 2 | 0 | 1.0 | 18.8 | 0.31 |
| /farsi1/fa_4219 | 477 | 0 | 2992 | 2 | 0 | 1.0 | 19.7 | 0.33 |
| /farsi1/fa_4221 | 253 | 0 | 1704 | 2 | 0 | 1.1 | 10.1 | 0.17 |
| /farsi1/fa_4230 | 136 | 0 | 816 | 2 | 0 | 0.9 | 6.7 | 0.11 |
French
Short name: french; glottolog name: French; glottocode: stan1290; family/type: Indo-European; macroarea: Eurasia
URL: https://mirjamernestus.nl/Ernestus/NCCFr/index.php

31.4 hours
| 37692 | 0 | 402228 | 2463 | 0.81 | 40 | 31.41 | 1200 |
samples

20 sources
Showing only the first 10 sources; use allsources=T to show all
| /french2/03-12-07_1 | 1494 | 0 | 14140 | 2 | 0 | 0.6 | 88.9 | 1.48 |
| /french2/04-12-07_1 | 1742 | 0 | 19399 | 2 | 0 | 0.7 | 95.5 | 1.59 |
| /french2/05-12-07_1 | 1541 | 0 | 18107 | 2 | 0 | 0.7 | 92.9 | 1.55 |
| /french2/14-11-07_1 | 2071 | 0 | 22531 | 2 | 0 | 0.8 | 108.5 | 1.81 |
| /french2/16-11-07_1 | 1530 | 0 | 19715 | 2 | 0 | 0.9 | 90.9 | 1.52 |
| /french2/16-11-07_2 | 1472 | 0 | 17269 | 2 | 0 | 0.7 | 86.8 | 1.45 |
| /french2/20-11-07_1 | 1874 | 0 | 13923 | 2 | 0 | 0.6 | 92.1 | 1.54 |
| /french2/22-11-07_1 | 2012 | 0 | 23219 | 2 | 0 | 1.0 | 90.9 | 1.51 |
| /french2/22-11-07_2 | 2024 | 0 | 22210 | 2 | 0 | 1.0 | 90.4 | 1.51 |
| /french2/23-11-07_1 | 1776 | 0 | 16585 | 2 | 0 | 0.7 | 90.0 | 1.50 |
German
Short name: german; glottolog name: German; glottocode: stan1295; family/type: Indo-European; macroarea: Eurasia
URL: https://catalog.ldc.upenn.edu/LDC97S43

18.6 hours
| 35104 | 0 | 216674 | 1957 | 1.04 | 5 | 18.63 | 1884 |
annotation types
| [clearsthroat] | 32 |
| [cough] | 23 |
| [lipsmack] | 2 |
| [sigh] | 19 |
| [sneeze] | 2 |
| [sniff] | 7 |
| breath | 493 |
| laugh | 1254 |
| talk | 32155 |
| NA | 1117 |
samples

120 sources
Showing only the first 10 sources; use allsources=T to show all
| /german1/4002 | 339 | 0 | 2237 | 2 | 58 | 1.1 | 10.1 | 0.17 |
| /german1/4024 | 267 | 0 | 2035 | 2 | 43 | 1.0 | 10.2 | 0.17 |
| /german1/4028 | 336 | 0 | 2030 | 2 | 88 | 1.0 | 10.0 | 0.17 |
| /german1/4049 | 157 | 0 | 1055 | 2 | 30 | 1.1 | 5.0 | 0.08 |
| /german1/4073 | 263 | 0 | 1682 | 2 | 41 | 1.0 | 10.1 | 0.17 |
| /german1/4076 | 353 | 0 | 1817 | 2 | 77 | 1.0 | 10.1 | 0.17 |
| /german1/4111 | 265 | 0 | 2231 | 2 | 78 | 1.0 | 10.1 | 0.17 |
| /german1/4123 | 268 | 0 | 1839 | 2 | 62 | 1.0 | 10.0 | 0.17 |
| /german1/4124 | 96 | 0 | 965 | 2 | 18 | 1.0 | 5.1 | 0.08 |
| /german1/4287 | 275 | 0 | 1820 | 2 | 82 | 0.9 | 10.1 | 0.17 |
Bininj Gun-Wok
Short name: gunwinggu; glottolog name: Bininj Kun-Wok; glottocode: gunw1252; family/type: Gunwinyguan; macroarea: Australia
URL: https://dx.doi.org/10.4225/72/56E97A3F99539

0.2 hours
| 275 | 0.92 | 666 | 1188 | 0.5 | 7 | 0.19 | 1447 |
samples

1 sources
| /gunwinggu1/SI1-004-transcr | 275 | 0.92 | 666 | 7 | 0 | 0.5 | 11.2 | 0.19 |
Gutob
Short name: gutob; glottolog name: Bodo Gadaba; glottocode: bodo1267; family/type: Austroasiatic; macroarea: Eurasia
URL: http://hdl.handle.net/2196/f027a3a2-d38f-4428-88ec-33b46d346cb3

2.2 hours
| 4820 | 0.06 | 15643 | 1402 | 0.84 | 14 | 2.19 | 2201 |
annotation types
| [cough] | 5 |
| laugh | 55 |
| talk | 4721 |
| NA | 39 |
samples

5 sources
| /gutob1/gutob-0444-20161205_1 | 189 | 0.07 | 489 | 3 | 0 | 0.7 | 5.5 | 0.09 |
| /gutob1/Gutob-0444-20161223 | 535 | 0.07 | 2012 | 4 | 0 | 0.8 | 17.0 | 0.28 |
| /gutob1/Gutob-0444-20180307_1 | 783 | 0.08 | 2774 | 5 | 0 | 0.9 | 23.1 | 0.39 |
| /gutob1/Gutob-0444-20180307_2 | 1159 | 0.04 | 3842 | 4 | 0 | 1.0 | 23.7 | 0.40 |
| /gutob1/Gutob-0444-20180328 | 2154 | 0.06 | 6526 | 3 | 0 | 0.8 | 61.8 | 1.03 |
Hausa
Short name: hausa; glottolog name: Hausa; glottocode: haus1257; family/type: Afro-Asiatic; macroarea: Africa
URL: http://www.language-archives.org/language/hau

0.8 hours
| 3152 | 0 | 10726 | 855 | 0.92 | 2 | 0.85 | 3708 |
samples

4 sources
| /hausa1/HAU_BC_CONV_01_BOYS | 738 | 0 | 2512 | 2 | 0 | 1.0 | 10.0 | 0.17 |
| /hausa1/HAU_BC_CONV_02_BOYS | 515 | 0 | 1713 | 2 | 0 | 1.1 | 6.2 | 0.10 |
| /hausa1/HAU_BC_CONV_03_GIRLS | 354 | 0 | 1388 | 2 | 0 | 0.8 | 7.2 | 0.12 |
| /hausa1/HAU_BC_CONV_04_MEN | 1545 | 0 | 5113 | 2 | 0 | 0.8 | 27.3 | 0.46 |
Heyo
Short name: heyo; glottolog name: Heyo; glottocode: heyo1240; family/type: Nuclear Torricelli; macroarea: Papunesia
URL: https://www.elararchive.org/dk0550/

0.6 hours
| 600 | 0.98 | 3011 | 2305 | 0.6 | 4 | 0.63 | 952 |
samples

1 sources
| /heyo1/heyo048_0002 | 600 | 0.98 | 3011 | 4 | 0 | 0.6 | 38 | 0.63 |
Hungarian
Short name: hungarian; glottolog name: Hungarian; glottocode: hung1274; family/type: Uralic; macroarea: Eurasia
URL: https://hucomtech.unideb.hu/hucomtech/

49.7 hours
| 115830 | 0 | 407033 | 1486 | 0.94 | 2 | 49.72 | 2330 |
annotation types
| [cough] | 150 |
| breath | 905 |
| laugh | 1856 |
| talk | 111399 |
| NA | 1520 |
samples

224 sources
Showing only the first 10 sources; use allsources=T to show all
| /hungarian1/003mv19_F_a_v | 311 | 0 | 1087 | 2 | 0 | 1.0 | 8.7 | 0.14 |
| /hungarian1/003mv19_I_a_v | 842 | 0 | 3536 | 2 | 0 | 1.0 | 22.2 | 0.37 |
| /hungarian1/006mc22_F_a_v | 354 | 0 | 1293 | 2 | 1 | 0.9 | 12.3 | 0.21 |
| /hungarian1/006mc22_I_a_v | 727 | 0 | 2717 | 2 | 0 | 1.1 | 16.7 | 0.28 |
| /hungarian1/007mc24_F_a_v | 301 | 0 | 1046 | 2 | 0 | 0.9 | 10.6 | 0.18 |
| /hungarian1/007mc24_I_a_v | 962 | 0 | 3563 | 2 | 0 | 1.0 | 23.2 | 0.39 |
| /hungarian1/008mc20_F_a_v | 154 | 0 | 547 | 2 | 0 | 0.7 | 5.4 | 0.09 |
| /hungarian1/008mc20_I_a | 253 | 0 | 1046 | 2 | 0 | 0.9 | 7.3 | 0.12 |
| /hungarian1/012mc25_F_a | 250 | 0 | 1004 | 2 | 0 | 0.9 | 10.0 | 0.17 |
| /hungarian1/012mc25_I_a | 665 | 0 | 2291 | 2 | 0 | 0.9 | 18.4 | 0.31 |
Italian
Short name: italian; glottolog name: Italian; glottocode: ital1282; family/type: Indo-European; macroarea: Eurasia
URL: https://www.sciencedirect.com/science/article/pii/S0167639321000303

5.4 hours
| 8854 | 1 | 59610 | 2407 | 1.11 | 20 | 5.38 | 1646 |
annotation types
| [cough] | 8 |
| breath | 389 |
| laugh | 375 |
| talk | 8071 |
| NA | 11 |
samples

10 sources
| /italian1/D01_01BF47 _02BF47 | 1030 | 1 | 4732 | 2 | 1 | 0.9 | 33.0 | 0.55 |
| /italian1/D02_03BM59 _04BM56 | 1019 | 1 | 7274 | 2 | 0 | 1.2 | 37.9 | 0.63 |
| /italian1/D04_07BF55_08BF55 | 910 | 1 | 7697 | 2 | 0 | 1.2 | 37.8 | 0.63 |
| /italian1/D05_09BF52_10BF52 | 1061 | 1 | 5943 | 2 | 0 | 1.0 | 33.7 | 0.56 |
| /italian1/D06_11LF28 _12LF30 | 778 | 1 | 5388 | 2 | 0 | 1.1 | 29.3 | 0.49 |
| /italian1/D08_15BM22 _16BF22 | 902 | 1 | 6382 | 2 | 0 | 1.1 | 28.5 | 0.47 |
| /italian1/D11_21BM60 _22BM51 | 606 | 1 | 4618 | 2 | 0 | 1.0 | 32.7 | 0.54 |
| /italian1/D12_23LM30 _24LF27 | 785 | 1 | 5465 | 2 | 0 | 1.1 | 28.2 | 0.47 |
| /italian1/D13_25LF23 _26BF24 | 850 | 1 | 6086 | 2 | 0 | 1.3 | 30.0 | 0.50 |
| /italian1/D15_29BF21 _30BM23 | 913 | 1 | 6025 | 2 | 0 | 1.2 | 32.2 | 0.54 |
Japanese
Short name: japanese; glottolog name: Japanese; glottocode: nucl1643; family/type: Japonic; macroarea: Eurasia
URL: https://ca.talkbank.org/access/CallFriend/jpn.html

13.4 hours
| 32955 | 0 | 163519 | 1331 | 0.9 | 37 | 13.42 | 2456 |
annotation types
| [cough] | 10 |
| [sneeze] | 3 |
| breath | 1143 |
| laugh | 78 |
| talk | 31402 |
| NA | 319 |
samples

32 sources
Showing only the first 10 sources; use allsources=T to show all
| /japanese3/0921 | 802 | 0 | 3758 | 2 | 9 | 0.9 | 15.0 | 0.25 |
| /japanese3/1367 | 700 | 0 | 3824 | 2 | 8 | 1.0 | 15.0 | 0.25 |
| /japanese3/1605 | 1171 | 0 | 6096 | 2 | 7 | 1.0 | 29.0 | 0.48 |
| /japanese3/1612 | 803 | 0 | 3720 | 3 | 7 | 0.9 | 18.3 | 0.30 |
| /japanese3/1684 | 1228 | 0 | 6792 | 2 | 59 | 0.9 | 30.0 | 0.50 |
| /japanese3/1722 | 1332 | 0 | 5206 | 2 | 194 | 0.9 | 30.0 | 0.50 |
| /japanese3/1758 | 1319 | 0 | 5783 | 3 | 15 | 1.0 | 30.0 | 0.50 |
| /japanese3/1773 | 612 | 0 | 3269 | 2 | 2 | 0.9 | 16.3 | 0.27 |
| /japanese3/1841 | 1190 | 0 | 7125 | 3 | 5 | 0.9 | 30.0 | 0.50 |
| /japanese3/2167 | 1277 | 0 | 6117 | 3 | 13 | 1.1 | 30.0 | 0.50 |
Jejueo
Short name: jejueo; glottolog name: Jejueo; glottocode: jeju1234; family/type: Koreanic; macroarea: Eurasia
URL: https://www.elararchive.org/dk0351/

3 hours
| 4270 | 0.25 | 18719 | 2398 | 0.96 | 13 | 3.01 | 1419 |
samples
8 sources
| /jejueo1/jeju0022_edited | 681 | 0.00 | 3641 | 2 | 0 | 0.9 | 38.3 | 0.64 |
| /jejueo1/jeju0080-08 | 107 | 0.00 | 675 | 2 | 0 | 1.1 | 5.8 | 0.10 |
| /jejueo1/jeju0105 | 401 | 0.00 | 1416 | 2 | 0 | 0.9 | 15.3 | 0.25 |
| /jejueo1/jeju0116-01-02 | 313 | 0.00 | 1270 | 5 | 0 | 1.0 | 13.8 | 0.23 |
| /jejueo1/jeju0116-04-07 | 568 | 0.00 | 2546 | 4 | 0 | 0.8 | 30.9 | 0.51 |
| /jejueo1/jeju0133 | 1093 | 0.02 | 4220 | 3 | 0 | 1.1 | 35.1 | 0.59 |
| /jejueo1/jeju0162 | 923 | 1.00 | 4061 | 3 | 0 | 0.9 | 34.6 | 0.58 |
| /jejueo1/jeju0168-01_interlinearised_0002 | 184 | 1.00 | 890 | 2 | 0 | 1.0 | 6.4 | 0.11 |
Juba Creole
Short name: juba_creole; glottolog name: South Sudanese Creole Arabic; glottocode: suda1237; family/type: Afro-Asiatic; macroarea: Africa
URL: http://www.language-archives.org/language/pga

0.5 hours
| 1662 | 1 | 6266 | 865 | 0.85 | 2 | 0.46 | 3613 |
samples

2 sources
| /juba_creole1/PGA_SM_CONV_1 | 674 | 1 | 2420 | 2 | 0 | 0.8 | 11.3 | 0.19 |
| /juba_creole1/PGA_SM_CONV_2 | 988 | 1 | 3846 | 2 | 0 | 0.9 | 16.1 | 0.27 |
Kakabe
Short name: kakabe; glottolog name: Kakabe; glottocode: kaka1265; family/type: Mande; macroarea: Africa
URL: http://hdl.handle.net/2196/3015b4c3-1ffc-4cc5-8309-f05f9d4ce8b2

1.6 hours
| 1812 | 0.97 | 15708 | 3145 | 0.98 | 28 | 1.59 | 1140 |
samples

5 sources
| /kakabe1/kke-c_2013-12-07_talk-02 | 337 | 0.98 | 3135 | 8 | 0 | 1.0 | 15.6 | 0.26 |
| /kakabe1/kke-c_2013-12-07_talk-04 | 192 | 0.97 | 2344 | 4 | 0 | 1.0 | 12.4 | 0.21 |
| /kakabe1/kke-c_2013-12-21_labiko-1 | 343 | 1.00 | 3224 | 5 | 0 | 1.1 | 14.4 | 0.24 |
| /kakabe1/kke-c_2013-12-21_labiko-smithy | 257 | 1.00 | 1725 | 6 | 0 | 0.8 | 12.8 | 0.21 |
| /kakabe1/kke-c_2013-12-22_jinkoya-talk-2 | 683 | 0.90 | 5280 | 11 | 0 | 1.0 | 40.4 | 0.67 |
Kelabit
Short name: kelabit; glottolog name: Kelabit; glottocode: kela1258; family/type: Austronesian; macroarea: Papunesia
URL: https://www.elararchive.org/dk0301/

0.6 hours
| 1080 | 1 | 5733 | 2073 | 1.06 | 3 | 0.59 | 1831 |
samples

5 sources
| /kelabit1/BAR01082014CH_03 | 178 | 1 | 1019 | 3 | 0 | 0.9 | 6.9 | 0.12 |
| /kelabit1/BAR01082014CH_04 | 149 | 1 | 668 | 3 | 0 | 1.1 | 3.9 | 0.06 |
| /kelabit1/BAR08092014CH_05 | 346 | 1 | 1861 | 2 | 0 | 1.1 | 12.1 | 0.20 |
| /kelabit1/BAR08092014CH_06 | 205 | 1 | 1047 | 2 | 0 | 1.1 | 6.0 | 0.10 |
| /kelabit1/BAR17082014CH_10 | 202 | 1 | 1138 | 2 | 0 | 1.1 | 6.7 | 0.11 |
Kerinci
Short name: kerinci; glottolog name: Kerinci; glottocode: keri1250; family/type: Austronesian; macroarea: Papunesia
URL: https://archive.mpi.nl/tla/islandora/object/tla%3A1839_00_0000_0000_0022_654E_D

4.4 hours
| 12705 | 0 | 57160 | 1066 | 0.53 | 31 | 4.37 | 2907 |
annotation types
| laugh | 51 |
| talk | 12591 |
| NA | 63 |
samples

11 sources
Showing only the first 10 sources; use allsources=T to show all
| /kerinci1/KER-20070205-FAD | 1065 | 0 | 4982 | 6 | 1065 | 0.0 | -Inf | 0.00 |
| /kerinci1/KER-20070925-FAD | 263 | 0 | 973 | 3 | 263 | 0.0 | -Inf | 0.00 |
| /kerinci1/KER-20071018-FAD | 368 | 0 | 1636 | 5 | 368 | 0.0 | -Inf | 0.00 |
| /kerinci1/KER-20100207-FAD | 1543 | 0 | 9466 | 4 | 1543 | 0.0 | -Inf | 0.00 |
| /kerinci1/KER-20100210-FAD | 1369 | 0 | 6240 | 6 | 6 | 0.8 | 43.9 | 0.73 |
| /kerinci1/KER-20110611-FAD | 1502 | 0 | 5786 | 6 | 5 | 0.9 | 30.8 | 0.51 |
| /kerinci1/KER-20120129-FAD | 1902 | 0 | 6899 | 2 | 1 | 0.7 | 54.5 | 0.91 |
| /kerinci1/KER-20120201-FAD | 1289 | 0 | 4954 | 2 | 4 | 0.8 | 29.8 | 0.50 |
| /kerinci1/KER-20120206-FADb | 1808 | 0 | 9254 | 4 | 2 | 0.9 | 57.1 | 0.95 |
| /kerinci1/KER-20140807-FAD | 449 | 0 | 1708 | 5 | 0 | 0.7 | 14.2 | 0.24 |
Khinalug
Short name: khinalug; glottolog name: Khinalug; glottocode: khin1240; family/type: Nakh-Daghestanian; macroarea: Eurasia
URL: https://hdl.handle.net/1839/c09498f1-12dc-4a7a-b21e-99a178660ff8

0.3 hours
| 328 | 0.95 | 1837 | 3678 | 0.97 | 5 | 0.35 | 937 |
samples

3 sources
| /khinalug1/Agasi02A_06_2012 | 127 | 0.93 | 778 | 2 | 0 | 0.9 | 7.1 | 0.12 |
| /khinalug1/Kamal03V_03_2013 | 151 | 0.94 | 698 | 2 | 0 | 1.0 | 10.5 | 0.17 |
| /khinalug1/Rahman02A_06_2012 | 50 | 0.98 | 361 | 2 | 0 | 1.0 | 3.3 | 0.06 |
Korean
Short name: korean; glottolog name: Korean; glottocode: kore1280; family/type: Koreanic; macroarea: Eurasia
URL: https://catalog.ldc.upenn.edu/LDC96S54

26.6 hours
| 42750 | 0 | 229721 | 2545 | 1.14 | 5 | 26.56 | 1610 |
annotation types
| [cough] | 49 |
| [lipsmack] | 70 |
| breath | 302 |
| laugh | 973 |
| talk | 40983 |
| NA | 373 |
samples

100 sources
Showing only the first 10 sources; use allsources=T to show all
| /korean1/4012 | 349 | 0 | 2315 | 2 | 0 | 1.4 | 15.8 | 0.26 |
| /korean1/4102 | 471 | 0 | 2588 | 2 | 0 | 1.2 | 15.6 | 0.26 |
| /korean1/4211 | 418 | 0 | 2302 | 2 | 0 | 1.0 | 15.4 | 0.26 |
| /korean1/4296 | 399 | 0 | 2159 | 2 | 0 | 1.0 | 15.1 | 0.25 |
| /korean1/4314 | 444 | 0 | 2444 | 2 | 0 | 1.1 | 15.3 | 0.26 |
| /korean1/4328 | 669 | 0 | 1794 | 3 | 0 | 0.9 | 16.2 | 0.27 |
| /korean1/4361 | 282 | 0 | 1973 | 2 | 0 | 1.1 | 15.6 | 0.26 |
| /korean1/4434 | 332 | 0 | 1658 | 2 | 0 | 1.1 | 16.9 | 0.28 |
| /korean1/4478 | 508 | 0 | 2413 | 2 | 0 | 1.1 | 15.0 | 0.25 |
| /korean1/4546 | 351 | 0 | 2564 | 2 | 0 | 1.4 | 15.3 | 0.25 |
Kula
Short name: kula; glottolog name: Kula (Indonesia); glottocode: kula1280; family/type: Timor-Alor-Pantar; macroarea: Papunesia
URL: https://www.elararchive.org/uncategorized/SO_0320f6f6-97d4-483f-88fa-755b4eeadc2f/?pg=1&hh_cmis_filter=imdi.writtenFileType/ELAN

2.6 hours
| 3885 | 0.53 | 16346 | 1939 | 0.78 | 8 | 2.65 | 1466 |
annotation types
| laugh | 31 |
| talk | 3742 |
| NA | 112 |
samples

13 sources
Showing only the first 10 sources; use allsources=T to show all
| /kula1/al-tpg-201310208-01_0002 | 398 | 0.66 | 1780 | 8 | 0 | 1.0 | 12.8 | 0.21 |
| /kula1/al-tpg-20131123-04_0002 | 510 | 0.55 | 2156 | 8 | 0 | 0.9 | 19.4 | 0.32 |
| /kula1/nw-tpg-20120605-01_0002 | 244 | 0.65 | 965 | 4 | 0 | 0.7 | 11.4 | 0.19 |
| /kula1/nw-tpg-20120605-02A_0002 | 190 | 0.26 | 658 | 3 | 0 | 0.8 | 5.5 | 0.09 |
| /kula1/nw-tpg-20120605-03_0002 | 610 | 0.65 | 1980 | 6 | 0 | 0.7 | 26.4 | 0.44 |
| /kula1/nw-tpg-20121021-01 | 131 | 0.42 | 427 | 6 | 0 | 0.9 | 4.0 | 0.07 |
| /kula1/nw-tpg-20121114-01 | 304 | 0.47 | 1342 | 5 | 0 | 0.7 | 10.9 | 0.18 |
| /kula1/nw-tpg-20121121-07 | 315 | 0.69 | 1429 | 7 | 0 | 0.8 | 14.5 | 0.24 |
| /kula1/nw-tpg-20121207-01 | 159 | 0.38 | 947 | 6 | 0 | 0.6 | 8.6 | 0.14 |
| /kula1/nw-tpg-20130103-04 | 135 | 0.49 | 675 | 4 | 0 | 0.9 | 6.1 | 0.10 |
Laal
Short name: laal; glottolog name: Laal; glottocode: laal1242; family/type: Laal; macroarea: Africa
URL: https://hdl.handle.net/1839/93472197-4462-489c-8cee-0d9a3587f3e5

0.4 hours
| 530 |
0.72 |
3390 |
1562 |
0.7 |
7 |
0.4 |
1325 |
samples

2 sources
| /laal1/GDM-Go_140310_F_ND2-KN2-HN1-ID1_09_Entretien-conversation |
290 |
0.85 |
2201 |
4 |
0 |
0.5 |
17.9 |
0.3 |
| /laal1/GDM-Go_20121108_F_hommes_Conversation |
240 |
0.60 |
1189 |
4 |
0 |
0.9 |
5.9 |
0.1 |
Mandarin Chinese
Short name: mandarin; glottolog name: Mandarin Chinese; glottocode: mand1415; family/type: Sino-Tibetan; macroarea: Eurasia
URL: https://catalog.ldc.upenn.edu/LDC96S34

18.6 hours
| 33490 |
0 |
253229 |
1914 |
0.97 |
8 |
18.64 |
1797 |
annotation types
| [cough] |
45 |
| [sigh] |
17 |
| laugh |
682 |
| talk |
32512 |
| NA |
234 |
samples

120 sources
Showing only the first 10 sources; use allsources=T to show all
| /mandarin2/0003 |
169 |
0 |
796 |
2 |
0 |
0.9 |
5.0 |
0.08 |
| /mandarin2/0022 |
192 |
0 |
1148 |
3 |
0 |
1.0 |
5.0 |
0.08 |
| /mandarin2/0027 |
129 |
0 |
1081 |
2 |
0 |
1.1 |
5.1 |
0.09 |
| /mandarin2/0029 |
374 |
0 |
2170 |
2 |
0 |
0.9 |
10.0 |
0.17 |
| /mandarin2/0030 |
165 |
0 |
1161 |
2 |
0 |
0.9 |
5.0 |
0.08 |
| /mandarin2/0104 |
166 |
0 |
1081 |
2 |
0 |
1.0 |
5.0 |
0.08 |
| /mandarin2/0106 |
157 |
0 |
1281 |
2 |
0 |
1.0 |
5.0 |
0.08 |
| /mandarin2/0110 |
237 |
0 |
1897 |
2 |
0 |
0.9 |
10.0 |
0.17 |
| /mandarin2/0111 |
129 |
0 |
1167 |
2 |
0 |
1.0 |
5.0 |
0.08 |
| /mandarin2/0626 |
119 |
0 |
923 |
3 |
0 |
0.9 |
5.3 |
0.09 |
Minderico
Short name: minderico; glottolog name: Minderico; glottocode: mind1263; family/type: Indo-European; macroarea: Eurasia
URL: https://hdl.handle.net/1839/f47b19bd-ac9c-434c-b559-c6ea00485f3c

0.7 hours
| 490 |
1 |
5021 |
4406 |
0.93 |
6 |
0.67 |
731 |
samples

3 sources
| /minderico1/090408atelier2_2 |
102 |
1 |
1056 |
2 |
0 |
1.0 |
7.6 |
0.13 |
| /minderico1/090424estamine_2 |
216 |
1 |
1914 |
3 |
0 |
0.8 |
18.6 |
0.31 |
| /minderico1/090913amoroso_vera_2 |
172 |
1 |
2051 |
2 |
0 |
1.0 |
13.6 |
0.23 |
Nahuatl
Short name: nahuatl; glottolog name: Tlaxcala-Puebla-Central Nahuatl; glottocode: cent2132; family/type: Uto-Aztecan; macroarea: North America
URL: http://www.openslr.org/92

43.9 hours
| 46293 |
0 |
393364 |
4344 |
1.26 |
43 |
43.88 |
1055 |
samples

299 sources
Showing only the first 10 sources; use allsources=T to show all
| /nahuatl1/Chilc_Botan_MFC307-RMM302_okwilkowit-kwaaokwilkowit-Verbenaceae_2011-07-19-f |
236 |
0 |
1685 |
3 |
0 |
1.1 |
12.8 |
0.21 |
| /nahuatl1/Chilc_Botan_RMM302-EGS301_kwaakwaanakatsitsiin-Rubiaceae_2011-07-15-f |
120 |
0 |
1248 |
3 |
0 |
1.0 |
8.4 |
0.14 |
| /nahuatl1/Chilc_Botan_RMM302-EGS301_tsotsokapahxiwit-Rubiaceae_2011-07-15-g |
164 |
0 |
1459 |
3 |
0 |
1.0 |
9.9 |
0.17 |
| /nahuatl1/Chilc_Botan_RMM302-MJS324_xaalkowit-Piperaceae_2011-07-19-m |
138 |
0 |
1121 |
3 |
0 |
1.2 |
7.8 |
0.13 |
| /nahuatl1/Chilc_Botan_RMM302-MSO325_mowih-Acanthaceae_2011-07-27-j |
91 |
0 |
1038 |
3 |
0 |
1.1 |
6.3 |
0.10 |
| /nahuatl1/Chilc_Botan_RMM302-MSO325_teenkwaakwalaxoochit-Acanthaceae_2011-07-27-k |
37 |
0 |
446 |
2 |
0 |
1.2 |
2.7 |
0.05 |
| /nahuatl1/Chilc_Botan_RMM302-MSO325_tewitsoot-Agavaceae_2011-07-27-a |
366 |
0 |
4070 |
3 |
0 |
1.2 |
24.0 |
0.40 |
| /nahuatl1/Chilc_Botan_RMM302-MSO325_xokotatopoonkowit-Acanthaceae_2011-07-27-l |
97 |
0 |
988 |
3 |
0 |
1.3 |
6.4 |
0.11 |
| /nahuatl1/Chilc_Botan_RMM302_aakiismekat-texokomekat-Vitaceae_2011-07-14-a |
201 |
0 |
1729 |
3 |
0 |
1.1 |
12.0 |
0.20 |
| /nahuatl1/Chilc_Botan_RMM302_aakwitaxoochit-teenkwaakwalaxoochit-Acanthaceae_2008-09-11-a |
55 |
0 |
590 |
3 |
0 |
1.1 |
4.3 |
0.07 |
Nasal
Short name: nasal; glottolog name: Nasal; glottocode: nasa1239; family/type: Austronesian; macroarea: Papunesia
URL: http://hdl.handle.net/2196/00-0000-0000-0010-798B-E

0.3 hours
| 907 |
0.97 |
2779 |
1066 |
0.88 |
3 |
0.32 |
2834 |
samples

4 sources
| /nasal1/NSY-20170711-C |
332 |
0.97 |
1054 |
2 |
0 |
0.9 |
7.0 |
0.12 |
| /nasal1/NSY-20170712-CA |
152 |
0.94 |
394 |
2 |
0 |
0.7 |
3.6 |
0.06 |
| /nasal1/NSY-20170719-C |
220 |
0.99 |
706 |
2 |
0 |
0.8 |
5.2 |
0.09 |
| /nasal1/NSY-20170721-C |
203 |
0.98 |
625 |
3 |
0 |
1.1 |
3.3 |
0.05 |
Nganasan
Short name: nganasan; glottolog name: Nganasan; glottocode: ngan1291; family/type: Uralic; macroarea: Eurasia
URL: https://corpora.uni-hamburg.de/hzsk/de/islandora/object/spoken-corpus:nslc-0.2

0.5 hours
| 794 |
0 |
3196 |
2215 |
1 |
9 |
0.49 |
1620 |
samples

5 sources
| /nganasan1/ChND-KES_061107_Dialog_conv |
95 |
0 |
335 |
2 |
1 |
1 |
3.2 |
0.05 |
| /nganasan1/KES-ChND_080725_Childhood_conv |
270 |
0 |
1389 |
2 |
0 |
1 |
11.9 |
0.20 |
| /nganasan1/KES-PED_080718_Dialog_conv1 |
24 |
0 |
105 |
3 |
0 |
1 |
2.2 |
0.04 |
| /nganasan1/KH-KNT_960810_Ngindjili_conv |
66 |
0 |
278 |
2 |
0 |
1 |
2.8 |
0.05 |
| /nganasan1/TTD-ChND_080719_Dialog_conv |
339 |
0 |
1089 |
2 |
0 |
1 |
9.1 |
0.15 |
N|uu
Short name: nuu; glottolog name: Ghaap-Kalahari; glottocode: nuuu1241; family/type: Tuu; macroarea: Africa
URL: http://hdl.handle.net/2196/4558585e-56ab-4e60-8d8d-5857b2bb96a3

0.6 hours
| 1210 |
0.98 |
8256 |
1255 |
0.7 |
12 |
0.63 |
1921 |
samples

2 sources
| /nuu1/NC080903-01_A-edited |
246 |
0.98 |
2116 |
4 |
2 |
0.7 |
8.8 |
0.15 |
| /nuu1/NM071213-01_A-edited |
964 |
0.99 |
6140 |
9 |
25 |
0.7 |
29.0 |
0.48 |
Okiek
Short name: okiek; glottolog name: Okiek; glottocode: okie1245; family/type: Nilotic; macroarea: Africa
URL: NA

0.2 hours
| 161 |
1 |
793 |
2780 |
0.8 |
4 |
0.16 |
1006 |
samples

1 sources
| /okiek1/okiek_conversations001_elar |
161 |
1 |
793 |
4 |
0 |
0.8 |
9.4 |
0.16 |
San Jerónimo Acazulco Otomi
Short name: otomi; glottolog name: Estado de México Otomi; glottocode: esta1236; family/type: Otomanguean; macroarea: North America
URL: http://hdl.handle.net/2196/e4af5b03-70ce-4dd3-8473-64813a515d8d

0.3 hours
| 718 |
0.99 |
3393 |
1843 |
0.93 |
7 |
0.35 |
2051 |
samples

3 sources
| /otomi1/20100712acjs-rc |
28 |
1.00 |
142 |
2 |
0 |
0.6 |
1.4 |
0.02 |
| /otomi1/20100712acpm-sm |
141 |
0.98 |
655 |
3 |
0 |
1.1 |
4.6 |
0.08 |
| /otomi1/20101010acjg-bvmil |
549 |
1.00 |
2596 |
3 |
0 |
1.1 |
15.2 |
0.25 |
Pagu
Short name: pagu; glottolog name: Pagu; glottocode: pagu1249; family/type: North Halmahera; macroarea: Papunesia
URL: https://hdl.handle.net/1839/00-0000-0000-0022-6530-D

0.7 hours
| 831 |
1 |
3185 |
1884 |
0.65 |
4 |
0.66 |
1259 |
samples

2 sources
| /pagu1/PAG-20120422 |
402 |
1 |
1606 |
2 |
0 |
0.7 |
18.6 |
0.31 |
| /pagu1/PAG-20120716 |
429 |
1 |
1579 |
2 |
0 |
0.6 |
21.2 |
0.35 |
Pite Saami
Short name: pite_saami; glottolog name: Pite Saami; glottocode: pite1240; family/type: Uralic; macroarea: Eurasia
URL: http://saami.uni-freiburg.de/psdp/

1 hours
| 1604 |
0.98 |
5964 |
1931 |
0.87 |
7 |
1.01 |
1588 |
samples

3 sources
| /pite_saami1/pit080924 |
692 |
1.00 |
2437 |
2 |
0 |
1.0 |
23.4 |
0.39 |
| /pite_saami1/pit090519 |
393 |
0.96 |
1275 |
4 |
0 |
0.7 |
15.0 |
0.25 |
| /pite_saami1/pit090702 |
519 |
0.98 |
2252 |
3 |
0 |
0.9 |
22.4 |
0.37 |
Polish
Short name: polish; glottolog name: Polish; glottocode: poli1260; family/type: Indo-European; macroarea: Eurasia
URL: http://pelcra.pl/new/spoken_corpora_50

15.8 hours
| 23851 |
0 |
123777 |
2132 |
0.9 |
87 |
15.78 |
1511 |
annotation types
| [cough] |
22 |
| [groan] |
2 |
| [sigh] |
21 |
| [sniff] |
30 |
| [yawn] |
2 |
| breath |
1645 |
| laugh |
340 |
| talk |
21208 |
| NA |
581 |
samples

28 sources
Showing only the first 10 sources; use allsources=T to show all
| /polish1/DS_001 |
1506 |
0 |
3983 |
3 |
0 |
0.9 |
31.8 |
0.53 |
| /polish1/DS_002 |
2028 |
0 |
8047 |
2 |
0 |
0.9 |
59.2 |
0.99 |
| /polish1/DS_005 |
1209 |
0 |
8617 |
3 |
0 |
1.1 |
58.5 |
0.97 |
| /polish1/DS_007 |
914 |
0 |
4743 |
2 |
0 |
0.9 |
32.9 |
0.55 |
| /polish1/DS_008 |
680 |
0 |
3988 |
2 |
0 |
1.0 |
28.3 |
0.47 |
| /polish1/DS_009 |
918 |
0 |
4265 |
2 |
0 |
0.9 |
27.9 |
0.47 |
| /polish1/DS_010 |
1016 |
0 |
4748 |
3 |
0 |
0.9 |
32.0 |
0.53 |
| /polish1/DS_011 |
500 |
0 |
3552 |
3 |
0 |
1.0 |
25.8 |
0.43 |
| /polish1/DS_012 |
818 |
0 |
2140 |
3 |
0 |
0.9 |
20.4 |
0.34 |
| /polish1/DS_013 |
689 |
0 |
4452 |
3 |
0 |
1.0 |
32.3 |
0.54 |
Sakun
Short name: sakun; glottolog name: Sukur; glottocode: suku1272; family/type: Afro-Asiatic; macroarea: Africa
URL: https://www.elararchive.org/dk0252

2 hours
| 1292 |
1 |
8519 |
2024 |
0.56 |
12 |
2 |
646 |
samples

11 sources
Showing only the first 10 sources; use allsources=T to show all
| /sakun1/baba1 |
87 |
1.00 |
818 |
2 |
0 |
0.8 |
5.9 |
0.10 |
| /sakun1/bull2 |
44 |
0.98 |
416 |
6 |
0 |
0.6 |
3.0 |
0.05 |
| /sakun1/bull3 |
28 |
1.00 |
296 |
3 |
0 |
0.9 |
1.8 |
0.03 |
| /sakun1/bull5 |
172 |
0.99 |
727 |
12 |
0 |
0.8 |
4.7 |
0.08 |
| /sakun1/cattlepen2 |
129 |
0.99 |
996 |
4 |
0 |
0.2 |
22.2 |
0.37 |
| /sakun1/newhouse2 |
43 |
1.00 |
464 |
2 |
0 |
0.9 |
2.5 |
0.04 |
| /sakun1/pottery1 |
105 |
1.00 |
662 |
5 |
0 |
0.5 |
7.3 |
0.12 |
| /sakun1/pottery2 |
313 |
1.00 |
1803 |
5 |
0 |
0.5 |
22.3 |
0.37 |
| /sakun1/ran1 |
69 |
1.00 |
514 |
5 |
0 |
0.7 |
3.3 |
0.05 |
| /sakun1/thatching1 |
73 |
1.00 |
328 |
4 |
0 |
0.1 |
14.1 |
0.23 |
Sambas
Short name: sambas; glottolog name: Kendayan-Belangin; glottocode: kend1254; family/type: Austronesian; macroarea: Papunesia
URL: https://archive.mpi.nl/tla/islandora/object/tla%3A1839_00_0000_0000_0022_5D7C_E

6.1 hours
| 51726 |
0 |
225681 |
338 |
0.17 |
45 |
6.13 |
8438 |
samples

24 sources
Showing only the first 10 sources; use allsources=T to show all
| /sambas1/SBS-20100203 |
2836 |
0 |
13286 |
3 |
2836 |
0 |
-Inf |
0 |
| /sambas1/SBS-20100222a |
1824 |
0 |
9059 |
3 |
1824 |
0 |
-Inf |
0 |
| /sambas1/SBS-20100222b |
2480 |
0 |
11258 |
4 |
2480 |
0 |
-Inf |
0 |
| /sambas1/SBS-20100301 |
595 |
0 |
2826 |
3 |
595 |
0 |
-Inf |
0 |
| /sambas1/SBS-20100303 |
3209 |
0 |
15957 |
4 |
3209 |
0 |
-Inf |
0 |
| /sambas1/SBS-20100305 |
2123 |
0 |
11376 |
5 |
2123 |
0 |
-Inf |
0 |
| /sambas1/SBS-20100609 |
828 |
0 |
3213 |
2 |
828 |
0 |
-Inf |
0 |
| /sambas1/SBS-20100617 |
2928 |
0 |
12182 |
4 |
2928 |
0 |
-Inf |
0 |
| /sambas1/SBS-20100709 |
3130 |
0 |
13427 |
4 |
3130 |
0 |
-Inf |
0 |
| /sambas1/SBS-20100710 |
2653 |
0 |
10805 |
5 |
2653 |
0 |
-Inf |
0 |
Siona
Short name: siona; glottolog name: Siona-Tetete; glottocode: sion1247; family/type: Tucanoan; macroarea: South America
URL: http://hdl.handle.net/2196/00-0000-0000-000D-EA53-3

2.9 hours
| 2420 |
0.39 |
10056 |
3344 |
0.78 |
7 |
2.9 |
834 |
samples

14 sources
Showing only the first 10 sources; use allsources=T to show all
| /siona1/20101119oispa001 |
151 |
0.97 |
813 |
2 |
0 |
1.2 |
6.2 |
0.10 |
| /siona1/20140723salsu002 |
27 |
0.26 |
175 |
2 |
0 |
0.9 |
2.2 |
0.04 |
| /siona1/20140804salsu001 |
133 |
0.27 |
421 |
2 |
0 |
0.3 |
11.8 |
0.20 |
| /siona1/20140804salsu003 |
24 |
0.33 |
62 |
2 |
0 |
0.7 |
2.0 |
0.03 |
| /siona1/20140805salsu003 |
218 |
0.41 |
873 |
2 |
0 |
0.5 |
26.6 |
0.44 |
| /siona1/20140805salsu005 |
287 |
0.78 |
1349 |
2 |
0 |
0.9 |
22.1 |
0.37 |
| /siona1/20140805salsu010 |
232 |
0.46 |
899 |
2 |
0 |
0.7 |
22.0 |
0.37 |
| /siona1/20140805salsu012 |
267 |
0.33 |
958 |
2 |
0 |
0.6 |
28.9 |
0.48 |
| /siona1/20140805salsu013 |
10 |
0.20 |
28 |
2 |
0 |
0.3 |
2.4 |
0.04 |
| /siona1/20140925salsu001 |
115 |
0.36 |
629 |
2 |
0 |
0.9 |
9.7 |
0.16 |
Siputhi
Short name: siputhi; glottolog name: Swati; glottocode: swat1243; family/type: Atlantic-Congo; macroarea: Africa
URL: http://hdl.handle.net/2196/ebca9f1e-c73c-4d22-8ed8-3abcb2d51ffa

0.3 hours
| 430 |
1 |
1753 |
2419 |
1 |
5 |
0.29 |
1483 |
samples

8 sources
| /siputhi1/20190205_1520_MAT_20200720 |
20 |
1 |
115 |
2 |
0 |
1.0 |
1.2 |
0.02 |
| /siputhi1/20190205_1609_MAT_20200928 |
23 |
1 |
208 |
2 |
0 |
1.0 |
2.4 |
0.04 |
| /siputhi1/20190207_1454_RAM_20200720 |
27 |
1 |
102 |
5 |
0 |
0.9 |
1.3 |
0.02 |
| /siputhi1/20190211_1634_MAK_20200720 |
49 |
1 |
270 |
2 |
0 |
1.0 |
2.9 |
0.05 |
| /siputhi1/20190211_1645_MAK_20200329 |
26 |
1 |
169 |
2 |
0 |
0.9 |
1.6 |
0.03 |
| /siputhi1/20190213_1307_QOI_20200928 |
192 |
1 |
607 |
2 |
0 |
1.0 |
5.6 |
0.09 |
| /siputhi1/20190213_1309_QOI_20200706 |
58 |
1 |
150 |
2 |
0 |
1.1 |
1.3 |
0.02 |
| /siputhi1/20190226_1602_MPA_20200720 |
35 |
1 |
132 |
3 |
0 |
1.1 |
1.2 |
0.02 |
Siwu
Short name: siwu; glottolog name: Siwu; glottocode: siwu1238; family/type: Atlantic-Congo; macroarea: Africa
URL: https://hdl.handle.net/1839/c410de17-81eb-4477-ae0d-d43ff1aea085

9.9 hours
| 18341 |
0.99 |
105903 |
1487 |
0.77 |
18 |
9.94 |
1845 |
annotation types
| [nod] |
8 |
| laugh |
292 |
| talk |
17798 |
| NA |
243 |
samples

7 sources
| /siwu1/Compound |
2123 |
0.96 |
11667 |
8 |
0 |
1.0 |
59.8 |
1.00 |
| /siwu1/Compound_4 |
1367 |
1.00 |
7294 |
6 |
0 |
0.7 |
64.5 |
1.08 |
| /siwu1/Maize_1 |
3865 |
0.98 |
23330 |
8 |
0 |
0.6 |
127.7 |
2.13 |
| /siwu1/Maize_3 |
1859 |
0.97 |
10910 |
8 |
0 |
0.7 |
58.0 |
0.97 |
| /siwu1/Neighbours |
4071 |
1.00 |
24347 |
8 |
0 |
1.0 |
106.4 |
1.77 |
| /siwu1/Two_men_2 |
1928 |
0.99 |
9175 |
3 |
0 |
0.6 |
59.8 |
1.00 |
| /siwu1/Two_men_3 |
3128 |
1.00 |
19180 |
7 |
0 |
0.8 |
119.6 |
1.99 |
Southern Pinghua
Short name: southern_pinghua; glottolog name: Southern Pinghua; glottocode: sout3250; family/type: Sino-Tibetan; macroarea: Eurasia
URL: NA

0.9 hours
| 510 |
0 |
9961 |
6204 |
1 |
1 |
0.88 |
580 |
samples
1 sources
| /southern_pinghua1/WCPH007_transcription_20200605 |
510 |
0 |
9961 |
1 |
0 |
1 |
53 |
0.88 |
Southern Qiang
Short name: southern_qiang; glottolog name: Southern Qiang; glottocode: sout2728; family/type: Sino-Tibetan; macroarea: Eurasia
URL: http://hdl.handle.net/2196/00-0000-0000-0012-5FAD-9

1.2 hours
| 1523 |
0 |
4972 |
1300 |
0.5 |
3 |
1.16 |
1313 |
samples

2 sources
| /southern_qiang1/YH-060 |
235 |
0 |
889 |
2 |
2 |
0.2 |
41.0 |
0.68 |
| /southern_qiang1/YH-837 |
1288 |
0 |
4083 |
3 |
0 |
0.8 |
28.6 |
0.48 |
Spanish
Short name: spanish; glottolog name: Spanish; glottocode: stan1288; family/type: Indo-European; macroarea: Eurasia
URL: https://catalog.ldc.upenn.edu/LDC96S35

27.6 hours
| 40202 |
0 |
304734 |
2558 |
1.04 |
32 |
27.63 |
1455 |
annotation types
| [clearsthroat] |
9 |
| [cough] |
6 |
| [sneeze] |
3 |
| [sniff] |
14 |
| breath |
52 |
| laugh |
588 |
| talk |
38971 |
| NA |
559 |
samples

182 sources
Showing only the first 10 sources; use allsources=T to show all
| /spanish2/0053 |
158 |
0 |
980 |
3 |
38 |
1.1 |
5.0 |
0.08 |
| /spanish2/0082 |
122 |
0 |
750 |
2 |
27 |
1.0 |
5.0 |
0.08 |
| /spanish2/0084 |
110 |
0 |
949 |
2 |
29 |
1.0 |
5.0 |
0.08 |
| /spanish2/0085 |
245 |
0 |
2254 |
2 |
37 |
1.0 |
10.2 |
0.17 |
| /spanish2/0088 |
139 |
0 |
913 |
2 |
31 |
1.0 |
5.0 |
0.08 |
| /spanish2/0096 |
270 |
0 |
1914 |
2 |
44 |
1.0 |
10.2 |
0.17 |
| /spanish2/0098 |
243 |
0 |
1770 |
3 |
73 |
1.1 |
10.0 |
0.17 |
| /spanish2/0100 |
317 |
0 |
1603 |
2 |
99 |
1.0 |
10.5 |
0.17 |
| /spanish2/0291 |
384 |
0 |
2096 |
2 |
94 |
1.0 |
11.0 |
0.18 |
| /spanish2/0616 |
302 |
0 |
1996 |
2 |
12 |
1.1 |
10.1 |
0.17 |
Tehuelche
Short name: tehuelche; glottolog name: Tehuelche; glottocode: tehu1242; family/type: Chonan; macroarea: South America
URL: http://hdl.handle.net/2196/00-0000-0000-0011-F549-B

1.5 hours
| 1562 |
0 |
6346 |
1314 |
0.4 |
4 |
1.5 |
1041 |
samples

2 sources
| /tehuelche1/tehuelche16 |
211 |
0 |
757 |
3 |
0 |
0.1 |
46.2 |
0.77 |
| /tehuelche1/tehuelche21 |
1351 |
0 |
5589 |
4 |
0 |
0.7 |
43.8 |
0.73 |
Tena Kichwa
Short name: tena_kichwa; glottolog name: Tena Lowland Quichua; glottocode: tena1240; family/type: Quechuan; macroarea: South America
URL: https://www.elararchive.org/dk0312/

1.4 hours
| 1939 |
1 |
8119 |
2258 |
0.89 |
11 |
1.36 |
1426 |
samples

8 sources
| /tena_kichwa1/ev_24052013_01 |
59 |
1 |
208 |
4 |
0 |
0.7 |
2.6 |
0.04 |
| /tena_kichwa1/in_01082013_02 |
183 |
1 |
1013 |
2 |
0 |
1.1 |
8.6 |
0.14 |
| /tena_kichwa1/in_01082013_16 |
52 |
1 |
284 |
2 |
0 |
1.0 |
3.0 |
0.05 |
| /tena_kichwa1/in_01082013_18 |
383 |
1 |
1501 |
2 |
0 |
0.8 |
15.1 |
0.25 |
| /tena_kichwa1/in_01082013_19 |
323 |
1 |
1187 |
3 |
0 |
0.8 |
12.1 |
0.20 |
| /tena_kichwa1/in_01082013_20 |
193 |
1 |
874 |
3 |
0 |
0.9 |
6.9 |
0.11 |
| /tena_kichwa1/in_01082013_21 |
386 |
1 |
1629 |
3 |
0 |
0.9 |
15.8 |
0.26 |
| /tena_kichwa1/in_02072013 |
360 |
1 |
1423 |
3 |
0 |
0.9 |
18.9 |
0.31 |
Totoli
Short name: totoli; glottolog name: Totoli; glottocode: toto1304; family/type: Austronesian; macroarea: Papunesia
URL: https://hdl.handle.net/1839/00-0000-0000-0014-C590-D

1.1 hours
| 4457 |
0.7 |
8625 |
806 |
0.89 |
35 |
1.11 |
4015 |
annotation types
| [cough] |
24 |
| talk |
3858 |
| NA |
575 |
samples

8 sources
| /totoli1/chat |
169 |
0.67 |
324 |
8 |
0 |
1.1 |
2.1 |
0.03 |
| /totoli1/Conv_Han_Salma |
215 |
0.45 |
440 |
2 |
0 |
0.7 |
4.4 |
0.07 |
| /totoli1/conversation |
654 |
0.55 |
1335 |
7 |
0 |
1.1 |
8.2 |
0.14 |
| /totoli1/conversation_2 |
1117 |
0.68 |
1978 |
6 |
0 |
0.9 |
17.2 |
0.29 |
| /totoli1/conversation_3 |
98 |
0.69 |
206 |
4 |
0 |
0.7 |
1.9 |
0.03 |
| /totoli1/language_situation |
734 |
0.82 |
1417 |
6 |
0 |
1.0 |
9.5 |
0.16 |
| /totoli1/silsilah_TTL_2 |
666 |
0.87 |
1373 |
4 |
0 |
0.8 |
9.8 |
0.16 |
| /totoli1/village_names_4 |
804 |
0.88 |
1552 |
3 |
0 |
0.8 |
13.5 |
0.23 |
Tseltal
Short name: tseltal; glottolog name: Tzeltal; glottocode: tzel1254; family/type: Mayan; macroarea: North America
URL: https://islandora-ailla.lib.utexas.edu/islandora/object/ailla%3A124445

1.7 hours
| 2666 |
1 |
12796 |
1639 |
0.67 |
33 |
1.72 |
1550 |
samples

3 sources
| /tseltal1/070627_Cancuc_panaderia_exito |
1146 |
1.00 |
5732 |
5 |
0 |
0.8 |
43.9 |
0.73 |
| /tseltal1/070728_Cancuc_paseo_a_chak_te |
1078 |
0.99 |
5075 |
6 |
0 |
0.7 |
39.3 |
0.65 |
| /tseltal1/080201_3_Tenejapa_Tajimal_Kin_Spayel__Mayil |
442 |
1.00 |
1989 |
24 |
0 |
0.5 |
20.5 |
0.34 |
Ulwa
Short name: ulwa; glottolog name: Ulwa; glottocode: ulwa1239; family/type: Misumalpan; macroarea: North America
URL: http://hdl.handle.net/2196/00-0000-0000-000F-CB61-A

2.6 hours
| 3216 |
0.8 |
20239 |
2934 |
0.97 |
5 |
2.64 |
1218 |
annotation types
| [cough] |
52 |
| [sigh] |
3 |
| [sniff] |
6 |
| [yawn] |
3 |
| laugh |
13 |
| talk |
3137 |
| NA |
2 |
samples

6 sources
| /ulwa1/ulwa014 |
1683 |
0.88 |
9151 |
2 |
0 |
1.0 |
74.4 |
1.24 |
| /ulwa1/ulwa037 |
1167 |
0.00 |
8574 |
2 |
0 |
0.9 |
65.8 |
1.10 |
| /ulwa1/ulwa038 |
109 |
0.99 |
779 |
2 |
0 |
1.0 |
5.0 |
0.08 |
| /ulwa1/ulwa040 |
54 |
1.00 |
395 |
2 |
0 |
1.0 |
2.8 |
0.05 |
| /ulwa1/ulwa041 |
66 |
0.95 |
439 |
2 |
0 |
0.9 |
3.3 |
0.06 |
| /ulwa1/ulwa042 |
137 |
0.99 |
901 |
2 |
0 |
1.0 |
6.4 |
0.11 |
Vamale
Short name: vamale; glottolog name: Vamale; glottocode: vama1243; family/type: Austronesian; macroarea: Papunesia
URL: http://hdl.handle.net/2196/044967e0-e54e-4f00-a979-fb751b2e66cf

1.3 hours
| 1507 |
0 |
9538 |
2211 |
0.72 |
13 |
1.32 |
1142 |
samples

4 sources
| /vamale1/vamale-170723_la-peche_STE |
89 |
0 |
921 |
2 |
0 |
0.8 |
5.4 |
0.09 |
| /vamale1/vamale-170731-cycle_de_vie |
548 |
0 |
3051 |
5 |
0 |
0.6 |
31.7 |
0.53 |
| /vamale1/vamale-170731-demander_main-MS |
335 |
0 |
2017 |
3 |
0 |
0.8 |
16.2 |
0.27 |
| /vamale1/vamale-190830-kito-4 |
535 |
0 |
3549 |
4 |
0 |
0.7 |
26.1 |
0.43 |
Wooi
Short name: wooi; glottolog name: Woi; glottocode: woii1237; family/type: Austronesian; macroarea: Papunesia
URL: https://hdl.handle.net/1839/eb0ab65a-e985-42d1-a9ee-fccdba47a526

0.9 hours
| 2124 |
0.6 |
5415 |
1116 |
0.71 |
64 |
0.94 |
2260 |
annotation types
| [cough] |
2 |
| laugh |
64 |
| talk |
1603 |
| NA |
455 |
samples

14 sources
Showing only the first 10 sources; use allsources=T to show all
| /wooi1/boatpreparation |
102 |
0.87 |
314 |
4 |
0 |
0.6 |
3.1 |
0.05 |
| /wooi1/BOBO_production-consumption |
192 |
0.68 |
541 |
8 |
0 |
0.8 |
5.0 |
0.08 |
| /wooi1/joking_conversation |
80 |
0.42 |
126 |
5 |
0 |
1.1 |
1.2 |
0.02 |
| /wooi1/KAPUR_production |
46 |
0.52 |
97 |
3 |
0 |
0.5 |
1.6 |
0.03 |
| /wooi1/KEPALADESA_dialog1 |
135 |
0.81 |
446 |
4 |
0 |
0.5 |
4.7 |
0.08 |
| /wooi1/kids_cleaningwell |
107 |
0.53 |
182 |
4 |
0 |
0.9 |
2.5 |
0.04 |
| /wooi1/kitchenconversation |
322 |
0.63 |
800 |
7 |
0 |
0.6 |
9.3 |
0.16 |
| /wooi1/Miosnum_dialog_female |
84 |
0.75 |
287 |
5 |
0 |
0.7 |
3.6 |
0.06 |
| /wooi1/Multilog_between_men |
56 |
0.57 |
197 |
12 |
0 |
0.8 |
1.4 |
0.02 |
| /wooi1/PAPEDA_eating1 |
299 |
0.70 |
866 |
7 |
0 |
0.7 |
6.7 |
0.11 |
Yakkha
Short name: yakkha; glottolog name: Yakkha; glottocode: yakk1236; family/type: Sino-Tibetan; macroarea: Eurasia
URL: http://hdl.handle.net/2196/d76bd932-9390-4c02-b7c9-1e8aa76b7234

0.9 hours
| 1373 |
1 |
31830 |
2590 |
1.06 |
8 |
0.92 |
1492 |
samples

5 sources
| /yakkha1/06_cvs_01 |
108 |
1 |
3050 |
2 |
0 |
0.9 |
4.1 |
0.07 |
| /yakkha1/13_cvs_02 |
123 |
1 |
4010 |
3 |
0 |
1.0 |
6.3 |
0.10 |
| /yakkha1/28_cvs_04 |
355 |
1 |
7719 |
5 |
0 |
1.1 |
14.1 |
0.24 |
| /yakkha1/29_cvs_05 |
177 |
1 |
3596 |
5 |
0 |
1.2 |
6.5 |
0.11 |
| /yakkha1/36_cvs_06 |
610 |
1 |
13455 |
3 |
0 |
1.1 |
24.2 |
0.40 |
Yali
Short name: yali; glottolog name: Pass Valley Yali; glottocode: pass1247; family/type: Nuclear Trans New Guinea; macroarea: Papunesia
URL: https://hdl.handle.net/1839/00-0000-0000-0017-EA2D-D

0.6 hours
| 1311 |
0.41 |
5394 |
1408 |
0.8 |
14 |
0.63 |
2081 |
samples

2 sources
| /yali1/conversation_1 |
538 |
0.78 |
1641 |
6 |
0 |
0.7 |
12.8 |
0.21 |
| /yali1/conversation_2 |
773 |
0.03 |
3753 |
8 |
0 |
0.9 |
25.4 |
0.42 |
Yélî Dnye
Short name: yeli_dnye; glottolog name: Yele; glottocode: yele1255; family/type: Yele; macroarea: Papunesia
URL: https://hdl.handle.net/1839/00-0000-0000-0000-C274-3

1.2 hours
| 1896 |
0 |
8708 |
1188 |
0.5 |
22 |
1.18 |
1607 |
annotation types
| [cough] |
8 |
| [nod] |
25 |
| laugh |
20 |
| talk |
1666 |
| NA |
177 |
samples

3 sources
| /yeli_dnye1/r03_v19_s2 |
652 |
0 |
2911 |
6 |
0 |
0.4 |
29.7 |
0.49 |
| /yeli_dnye1/r03_v20_s5 |
692 |
0 |
2898 |
11 |
0 |
0.5 |
24.2 |
0.40 |
| /yeli_dnye1/r03_v21_s1 |
552 |
0 |
2899 |
6 |
0 |
0.6 |
17.2 |
0.29 |
Zaar
Short name: zaar; glottolog name: Saya; glottocode: saya1246; family/type: Afro-Asiatic; macroarea: Africa
URL: http://www.language-archives.org/language/say

0.5 hours
| 1754 |
0.95 |
5608 |
818 |
0.83 |
2 |
0.49 |
3580 |
samples

3 sources
| /zaar1/SAY_BC_CONV_01 |
343 |
0.95 |
1107 |
2 |
0 |
0.9 |
5.4 |
0.09 |
| /zaar1/SAY_BC_CONV_02 |
429 |
0.96 |
1367 |
2 |
0 |
0.8 |
8.9 |
0.15 |
| /zaar1/SAY_BC_CONV_03 |
982 |
0.95 |
3134 |
2 |
0 |
0.8 |
15.0 |
0.25 |
Zacatepec Chatino
Short name: zacatepec_chatino; glottolog name: Zacatepec Chatino; glottocode: zaca1242; family/type: Otomanguean; macroarea: North America
URL: NA

1.8 hours
| 2154 |
0.45 |
24103 |
2726 |
0.82 |
6 |
1.8 |
1197 |
samples

5 sources
| /zacetepec_chatino1/zac-2011_06_03-trans_mgh_mcg-sv |
171 |
0.86 |
1066 |
2 |
0 |
0.9 |
8.2 |
0.14 |
| /zacetepec_chatino1/ZAC-2011_06_08-Trans_MGH_SC_FH-sv |
226 |
0.70 |
1311 |
2 |
0 |
0.6 |
15.4 |
0.26 |
| /zacetepec_chatino1/ZAC-2011_06_17-Trans_MGH_AMH_ED-sv |
168 |
0.21 |
1139 |
3 |
0 |
0.8 |
8.1 |
0.13 |
| /zacetepec_chatino1/ZAC-2011_06_22-Trans_MGH_AMH_IHG-sv |
274 |
0.46 |
1637 |
2 |
0 |
0.8 |
15.6 |
0.26 |
| /zacetepec_chatino1/zac-2012_07_11-trans_mgh_mbh_amp |
1315 |
0.00 |
18950 |
2 |
0 |
1.0 |
60.5 |
1.01 |
Zauzou
Short name: zauzou; glottolog name: Zauzou; glottocode: zauz1238; family/type: Sino-Tibetan; macroarea: Eurasia
URL: http://hdl.handle.net/2196/bc64e9fe-4ce0-4af7-b79d-39d73e6ff66f

1.4 hours
| 1833 |
1 |
15289 |
2488 |
0.88 |
5 |
1.42 |
1291 |
samples

10 sources
| /zauzou1/170806OuYuHua-WayToMarket1 |
69 |
1.00 |
747 |
3 |
0 |
0.9 |
3.8 |
0.06 |
| /zauzou1/170806OuYuHua-WayToMarket2 |
45 |
1.00 |
466 |
2 |
0 |
0.7 |
3.2 |
0.05 |
| /zauzou1/170908EverydayConversation-LiLiMeiHouse |
349 |
0.99 |
2717 |
3 |
0 |
0.8 |
16.4 |
0.27 |
| /zauzou1/170913YangLiZhong-RepairPipe |
382 |
0.99 |
3133 |
3 |
0 |
1.0 |
13.4 |
0.22 |
| /zauzou1/170928EverydayConversation-LiYuJiongLiShunXiang |
212 |
1.00 |
1629 |
2 |
0 |
0.9 |
9.5 |
0.16 |
| /zauzou1/180802ConversationInSnackShop |
60 |
1.00 |
530 |
2 |
0 |
0.7 |
3.5 |
0.06 |
| /zauzou1/180816ConversationBetweenRituals2-YangShuShanLiLiMei |
53 |
1.00 |
464 |
2 |
0 |
1.0 |
2.3 |
0.04 |
| /zauzou1/180816ConversationBetweenRituals3-LiLiMeiYangShuShan |
78 |
1.00 |
797 |
2 |
0 |
1.0 |
4.3 |
0.07 |
| /zauzou1/180918Conversation-LiYuJiongYangShuShan-GuestVisit01 |
326 |
1.00 |
2695 |
3 |
0 |
0.9 |
17.4 |
0.29 |
| /zauzou1/190521Conversation-ChildBirth03 |
259 |
1.00 |
2111 |
5 |
0 |
0.9 |
11.8 |
0.20 |
Examples of reasons for exclusion
While every single corpus considered here represents an immensely valuable record of communicative behaviour and linguistic resources used in interaction, differences in annotation standards mean that not all corpora are equally useful for all purposes.
For instance, a corpus might consist of a large number of transcribed segments that can be useful for purposes relating to automatic speech recognition, but it may be mostly monologic, which makes it harder to use for the analysis of interactional infrastructure. Or a corpus may provide sufficient data for some corpus linguistic analyses of broad grammatical structures, while its annotations are only roughly aligned with the actual speech signal, making it hard to use for speech recognition or conversation analytical purposes.
In this section we discuss a number of examples of corpora along with possible reasons for excluding them from some kinds of analyses.
Duoxu
Duoxu is a small corpus (a little over 300 annotations) that is mostly monologic. While each of the sessions contains at least 2 participants (qualifying for inclusion), the actual interactions show little dyadic interaction. That only ~70 out of ~350 annotations count as transitions between participants means that most conversations consist of turns produced in succession by one participant without interactive contributions by the other.
This means that the Duoxu corpus may be useful for phonetic or morphosyntactic research, but that it doesn’t provide sufficient stretches of casual conversation to inform analyses of interactional infrastructure.
inspect_corpus("duoxu")

0.4 hours
| 327 |
0.5 |
3128 |
3530 |
0.8 |
4 |
0.4 |
818 |
samples

2 sources
| /duoxu1/duoxu800 |
157 |
1 |
1460 |
2 |
0 |
0.7 |
11.8 |
0.2 |
| /duoxu1/duoxu801 |
170 |
0 |
1668 |
2 |
0 |
0.9 |
11.8 |
0.2 |
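The share of annotations that count as transitions between participants (roughly 70 out of roughly 350 for Duoxu, or about 0.2) can be computed with a few lines of base R. The following is a minimal sketch and not the code used to produce this report: the toy data frame and its column names (`source`, `begin`, `participant`) are assumptions made for illustration.

```r
# Toy data: one source with annotations ordered by start time (ms).
# The column names are assumptions for this sketch, not the report's schema.
toy <- data.frame(
  source      = rep("s1", 8),
  begin       = c(0, 1200, 2500, 4000, 5200, 6900, 8000, 9500),
  participant = c("A", "A", "A", "B", "A", "A", "A", "A")
)

# An annotation counts as a transition when the previous annotation in the
# same source was produced by a different participant.
transition_share <- function(df) {
  df <- df[order(df$source, df$begin), ]
  n <- nrow(df)
  same_source <- c(FALSE, df$source[-1] == df$source[-n])
  new_speaker <- c(FALSE, df$participant[-1] != df$participant[-n])
  is_transition <- same_source & new_speaker
  c(transitions = sum(is_transition),
    annotations = n,
    share = sum(is_transition) / n)
}

transition_share(toy)
```

A low share indicates long runs of annotations by the same participant, which is the pattern described for Duoxu.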
Hungarian
Hungarian is an enormous and well-transcribed corpus, but it stands out among other large corpora in having a very large number of transitions timed at exactly 0. Over 27% of all speaker transitions are timed like this, which makes it an outlier relative to other corpora.
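# `d` is assumed to be the data frame of annotations loaded earlier in this report
library(dplyr)    # filter(), %>%
library(tidyr)    # drop_na()
library(ggplot2)
library(ggthemes) # theme_tufte()

d %>%
  filter(language %in% c("dutch", "hungarian"),
         participants == 2) %>%
  drop_na(FTO) %>%  # keep only rows that have a transition timing
  ggplot(aes(FTO)) +
  theme_tufte() +
  ggtitle("Comparing timing distributions in Dutch and Hungarian corpora") +
  geom_vline(xintercept = 0, colour = "#cccccc") +  # reference line at 0
  geom_density(trim = TRUE) +
  xlim(c(-2000, 2000)) +
  facet_wrap(~ language)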
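
The proportion of transitions timed at exactly 0 can be computed per language in a similar way. The sketch below uses base R and invented toy values; only the column names `language` and `FTO` are taken from the plotting code above, and the assumption that `FTO` is in milliseconds is ours.

```r
# Toy data using the column names from the plot above (`language`, `FTO`).
# The values are invented for illustration only.
toy <- data.frame(
  language = c(rep("dutch", 5), rep("hungarian", 5)),
  FTO      = c(-350, 120, 0, 480, NA, 0, 0, 210, -90, 0)
)

# Proportion of transitions timed at exactly 0, per language.
# Rows without an FTO value are dropped first, as in the plot.
zero_share <- function(df) {
  df <- df[!is.na(df$FTO), ]
  tapply(df$FTO == 0, df$language, mean)
}

zero_share(toy)
```

Applied to the full data, a value far above that of comparable corpora would flag the kind of spike at 0 described for Hungarian.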

Nahuatl
The Nahuatl corpus originated as recordings of ethnobotanical elicitation sessions and is a formidable resource made available through OpenSLR. Both the mode of interaction and the way it has been segmented make it hard to use, without considerable additional work, for sequential or interactional analyses of joint action, timing, and turn-taking.
Many of the Nahuatl recordings are monologic (as in the two lower examples) or consist of highly skewed dialogue, with one speaker supplying ethnobotanical identifications and another speaker providing relatively minimal responses. When there is more interaction, as in the first two examples, the segmentation bears limited relation to the speech signal. Annotations are either fully overlapping or exactly non-overlapping. Partial overlaps are rare.
nahuatl_uids <- c("nahuatl-041-082-141587",
"nahuatl-066-344-732468",
"nahuatl-244-109-412319",
"nahuatl-273-239-1014736")
convplot(nahuatl_uids,content=T,window=15000,dyads=T)
[1] "seeing 4 dyads in 4 non-overlapping extracts"
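The distinction between fully overlapping, non-overlapping, and partially overlapping annotations can be made concrete with a small classifier. This is a sketch under stated assumptions, not part of the report's pipeline: `begin` and `end` are hypothetical vectors of annotation times in milliseconds, sorted by `begin`, and each annotation is compared only with the one immediately before it.

```r
# Classify how each annotation relates in time to the preceding one.
# Assumes annotations are sorted by `begin`; names are hypothetical.
classify_overlap <- function(begin, end) {
  n <- length(begin)
  prev_begin <- begin[-n]
  prev_end   <- end[-n]
  cur_begin  <- begin[-1]
  cur_end    <- end[-1]
  # "full": one annotation lies entirely within the other
  contained <- cur_end <= prev_end | cur_begin == prev_begin
  ifelse(cur_begin >= prev_end, "none",
         ifelse(contained, "full", "partial"))
}

toy_begin <- c(0, 500, 2000, 2000, 3500, 4200)
toy_end   <- c(1500, 1200, 3000, 3000, 4500, 5000)

table(classify_overlap(toy_begin, toy_end))
```

In a corpus segmented close to the speech signal, one would expect a fair number of "partial" cases; a distribution dominated by "full" and "none" matches the pattern described for Nahuatl.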
Akie and Mambila
Akie and Mambila are further examples of corpora in which the timing of annotations does not conform to the actual speech signal. The main observation here is that all annotations are mutually exclusive: there is never any overlap. Given the usual distribution of turn-taking timing in interaction, this cannot represent the actual temporal distribution of turns, and indeed inspection of the audio recordings for these corpora shows that it does not. This means, in effect, that what is transcribed in an annotation roughly corresponds to a turn at talk, but that the details of the timing of this turn, such as its duration and its precise placement in relation to others’ turns, cannot be treated as accurate.
While these corpora do lend themselves to several forms of linguistic analysis, their method of segmentation means that it would take considerable additional work to use this data in analyses of timing and turn-taking as well as for qualitative and quantitative analysis of talk-in-interaction.
example_uids <- c("akie-1-084-198851",
"akie-1-154-328594",
"mambila-1-0156-288901",
"mambila-1-0959-1813440")
convplot(example_uids,content=T,window=15000,dyads=T)
[1] "seeing 4 dyads in 4 non-overlapping extracts"